This project is read-only.

google crawl error

Topics: General
Oct 19, 2012 at 9:46 PM
Edited Oct 19, 2012 at 10:00 PM


Today I looked at the results in Google Webmaster Tools and noticed that Google had reported crawl errors. A few of them were legitimate; however, some I could not interpret, such as:


I ran a test and noticed that pages could be retrieved via the "Content/Item/Display/xyz" URL, where xyz is the ID of the content item.

I understand (conceptually) all the URL rewriting done in Orchard, but I wonder how the Google crawler managed to pick this one up.

Do I need to add a "Disallow: /Contents" rule to robots.txt, or will that break other things? Or is there another way to look at it?
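For what it's worth, a minimal robots.txt along those lines might look like the sketch below. This is only an illustration: the /Contents/Item/Display/ path is an assumption based on the URL pattern described above, and the right prefix depends on how your routes are actually configured.

```text
# Sketch only - adjust paths to match your site's actual routes
User-agent: *
Disallow: /Contents/Item/Display/
```

Scoping the rule to the Display route (rather than a blanket Disallow: /Contents) should block only the ID-based fallback URLs while leaving the friendly, rewritten URLs crawlable, though it may be worth verifying that nothing else you want indexed lives under that prefix.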

Thanks in advance.

EDIT: I just found out that Google discovered this URL by following ".../rss?projection=31". Maybe it's "rss" that should be disallowed in robots.txt?

Oct 19, 2012 at 9:52 PM

There has got to be a link to those somewhere. Doing something in robots.txt sounds like a great idea.