View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000881 | Main CAcert Website | website content | public | 2010-10-15 08:47 | 2011-06-22 00:09 |
Reporter | edgarwahn | Assigned To | |||
Priority | normal | Severity | minor | Reproducibility | have not tried |
Status | new | Resolution | open | ||
Summary | 0000881: crawlers have permission to crawl through the cacert website | ||||
Description | Andreas noted that there is no robots.txt to disallow crawling. A robots.txt was created in /www next to index.php with the content: User-agent: * Disallow: / | ||||
Tags | No tags attached. | ||||
Reviewed by | |||||
Test Instructions | |||||
|
added file, exported to cacert1 |
|
patch: git diff c6ed18141161adf6b17ea07d9c6a8eeb37f6eaa1..00e56b6ea045915eb2869688d54fe7d05661faf2 |
|
The above robots.txt may not be appropriate for production; it was a quick hack to preserve the resources of the test system, which has only a low-bandwidth connection at the moment. If indexing of the main site is intentional, we should supply an explicit robots.txt that allows it, so the decision is documented. |
|
We should not stop crawling on the production system; we should limit it properly. The robots.txt mentioned above is not adequate. |
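
To make the difference between blocking and limiting concrete, here is a minimal sketch that checks two robots.txt variants with Python's standard `urllib.robotparser`. The first variant is the blanket rule from the description. The second is only an assumption of what a limited policy might look like: the `/private/` path, the `Crawl-delay` value and the `ExampleBot` user agent are hypothetical placeholders and are not taken from the CAcert tree. `Crawl-delay` is a non-standard directive that only some crawlers honor.

```python
from urllib.robotparser import RobotFileParser

# Blanket rule quoted in the issue description: every crawler is kept out of every path.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical limited policy. "/private/" and the delay value are placeholders,
# not real CAcert paths or agreed settings.
# Python's parser applies the first matching rule, so Disallow precedes Allow.
LIMITED = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
Allow: /
"""


def can_fetch(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for name, policy in (("block all", BLOCK_ALL), ("limited", LIMITED)):
        for path in ("/index.php", "/private/page.php"):
            print(name, path, can_fetch(policy, path))
```

Under the blanket rule both paths are refused. Under the sketched limited policy `/index.php` stays crawlable and only the placeholder `/private/` subtree is excluded. Which paths should really be excluded on production is still open in this issue.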
Date Modified | Username | Field | Change |
---|---|---|---|
2010-10-15 08:47 | edgarwahn | New Issue | |
2010-10-15 08:56 | edgarwahn | Note Added: 0001751 | |
2010-10-15 08:59 | edgarwahn | Note Added: 0001753 | |
2010-10-15 13:14 | Andreas Baess | Note Added: 0001754 | |
2010-10-17 10:42 | Sourcerer | Note Added: 0001757 | |
2011-06-22 00:09 | NEOatNHNG | Source_changeset_attached | => cacert-devel master 6bc7253c |