View Issue Details

ID: 0000881
Project: Main CAcert Website
Category: website content
View Status: public
Last Update: 2011-06-22 00:09
Reporter: edgarwahn
Assigned To:
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: new
Resolution: open
Summary: 0000881: crawlers have permission to crawl through the cacert website
Description: Andreas noted that there is no robots.txt to disallow crawling.

A robots.txt was created in /www, next to index.php:

User-agent: *
Disallow: /
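
To make the effect of these two lines concrete, here is a minimal sketch that feeds them to Python's standard-library robots.txt parser. The crawler name and the URL are made-up placeholders, not anything taken from the CAcert setup.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the description above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the example.org URL are hypothetical placeholders.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://www.example.org/index.php"))
```

With these rules every well-behaved crawler is told to stay away from every path, so nothing on the site is eligible for indexing.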
Tags: No tags attached.
Reviewed by
Test Instructions

Activities

edgarwahn

2010-10-15 08:56

developer   ~0001751

added file, exported to cacert1

edgarwahn

2010-10-15 08:59

developer   ~0001753

patch: git diff c6ed18141161adf6b17ea07d9c6a8eeb37f6eaa1..00e56b6ea045915eb2869688d54fe7d05661faf2

Andreas Baess

2010-10-15 13:14

developer   ~0001754

The above robots.txt may not be appropriate for production. It was a quick hack to preserve the resources of the test system, which has only a low-bandwidth connection at the moment.

If indexing of the main site is intended, we should supply an explicit robots.txt that allows it, so that the decision is documented.

Sourcerer

2010-10-17 10:42

administrator   ~0001757

We should not stop crawling on the production system; we should limit it properly. The robots.txt mentioned above is not adequate.
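
As a rough illustration of what "limiting" could look like, the sketch below checks a more selective rule set with Python's standard-library parser. The /account/ path, the delay value, and the crawler name are assumptions chosen for the example, not a proposal for the actual CAcert paths. Crawl-delay is a non-standard extension that some crawlers ignore.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: slow crawlers down and keep them out of one area.
# The path and the delay are illustrative assumptions only.
LIMITED_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /account/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can be queried."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    rules = load_rules(LIMITED_ROBOTS_TXT)
    # "ExampleBot" and the example.org URLs are hypothetical placeholders.
    print(rules.can_fetch("ExampleBot", "https://www.example.org/index.php"))
    print(rules.can_fetch("ExampleBot", "https://www.example.org/account/login.php"))
    print(rules.crawl_delay("ExampleBot"))
```

Under these assumed rules the public pages stay crawlable, the excluded area does not, and crawlers that honor Crawl-delay wait between requests.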

Issue History

Date Modified Username Field Change
2010-10-15 08:47 edgarwahn New Issue
2010-10-15 08:56 edgarwahn Note Added: 0001751
2010-10-15 08:59 edgarwahn Note Added: 0001753
2010-10-15 13:14 Andreas Baess Note Added: 0001754
2010-10-17 10:42 Sourcerer Note Added: 0001757
2011-06-22 00:09 NEOatNHNG Source_changeset_attached => cacert-devel master 6bc7253c