robots.txt

Topics: Administration
Mar 29, 2011 at 6:37 PM

Although I searched the wiki, I did not find a suggestion for a minimal robots.txt.

Are there any recommendations for specific excludes of Orchard directories and/or files for a robots.txt?

 

Coordinator
Mar 29, 2011 at 6:46 PM

Did you try this module? http://orchardproject.net/gallery/List/Modules/Orchard.Module.SH.Robots

Mar 29, 2011 at 7:27 PM

Kinda funny-peculiar that right after I installed it and received the error below, the documentation link is giving me a 503 error.

Successfully added 'Orchard.Module.SH.Robots 1.0.0' to D:\inetpub\wwwroot\Spaces\jeffa\blogificating.com\wwwroot\
The module has been successfully installed. Its features can be enabled in the "Configuration | Features" page accessible from the menu.
Error loading extensions from gallery source 'Orchard Extensions Gallery'. An error occurred while processing this request..
I enabled the module and saved the presented suggestion, which is the same as what is displayed when I access http://www.blogificating.com/robots.txt:
User-agent: *
Allow: /
Looks like it's working. Will attempt to RTFM later...

Oct 4, 2012 at 6:16 AM

I would like to know whether there's a set of recommendations for robots.txt as well. I am using the aforementioned module and it runs fine; it simply suggests the following:

User-agent: *
Allow: /

Which I would think amounts to: allow any User Agent, "but disallow whatever crawling", if I am correct? Does this still allow crawling of a sitemap.xml when making use of the Advanced Sitemap module as per http://gallery.orchardproject.net/List/Modules/Orchard.Module.WebAdvanced.Sitemap?

Oct 4, 2012 at 11:16 AM

Yes, the robots.txt rules you pasted allow crawling of /sitemap.xml. They also allow crawling by any user agent, so your statement "but disallow whatever crawling" is incorrect: "User-agent: *" applies the group to every crawler, and "Allow: /" permits every path, so nothing is disallowed.
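If you want to check this yourself, here is a minimal sketch using Python's standard urllib.robotparser module. The host www.example.com and the crawler name "ExampleBot" are placeholders I made up for illustration, and this only shows how Python's parser reads the rules; individual crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser

# The two rules suggested by the module, exactly as pasted in the thread.
rules = """\
User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Placeholder host and crawler name, used for illustration only.
sitemap_allowed = parser.can_fetch("*", "http://www.example.com/sitemap.xml")
admin_allowed = parser.can_fetch("ExampleBot", "http://www.example.com/Admin/")

print("sitemap.xml allowed:", sitemap_allowed)
print("/Admin/ allowed:", admin_allowed)
```

Both checks should come back True, since nothing is disallowed by these rules.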

Jun 10, 2013 at 8:52 PM

Here are the ones we currently exclude:

Disallow: /Users/
Disallow: /Admin/
Disallow: /core/
Disallow: /Modules/
Disallow: /Packaging/
Disallow: /Themes/
Disallow: /Projector/
Disallow: /Media/

I may be missing a couple still.
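For anyone wanting to test a list like this before deploying it, here is a rough sketch using Python's standard urllib.robotparser. The "User-agent: *" line is my assumption (the list above does not show one, and Disallow lines need to sit under a User-agent group), and the host and sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Disallow lines are from the post above. The "User-agent: *" line is an
# assumption, added so the rules form a complete group.
rules = """\
User-agent: *
Disallow: /Users/
Disallow: /Admin/
Disallow: /core/
Disallow: /Modules/
Disallow: /Packaging/
Disallow: /Themes/
Disallow: /Projector/
Disallow: /Media/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

base = "http://www.example.com"  # placeholder host
sample_paths = [
    "/",                        # site root
    "/sitemap.xml",             # sitemap at the root
    "/Admin/",                  # listed directory
    "/Media/Default/logo.png",  # hypothetical file under /Media/
    "/admin/",                  # same directory, different letter case
]

# Map each sample path to whether a generic crawler may fetch it.
checks = {path: parser.can_fetch("*", base + path) for path in sample_paths}

for path, allowed in checks.items():
    print(f"{path}: {'allowed' if allowed else 'blocked'}")
```

With this parser, the root and /sitemap.xml stay crawlable, while /Admin/ and anything under /Media/ are blocked, so images stored under /Media/ would be excluded too. Matching here is case-sensitive, so /admin/ is not covered by the /Admin/ rule.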