Matt Cutts explains in a video why Google identifies, but doesn't "fetch," a page when the Web page is blocked by a robots.txt file. Because the file prevents Google from crawling the page, the URL is treated as an uncrawled URL. He also explains why Google sometimes shows uncrawled URLs in search results, pointing to www.dmv.ca.gov as an example. The number of links pointing to the URL
plays a part in the decision, he says.
Cutts says Google can consult the Open Directory Project to find information about the page, so Google can return information about a page without violating robots.txt. If you really don't want a page indexed, he says, let Google crawl it and use a noindex meta tag at the top of the Web page. He provides another solution as well.
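To illustrate the distinction Cutts draws: a robots.txt Disallow rule blocks crawling, so Google never sees the page's content (and the URL can still surface as an uncrawled result), whereas a noindex meta tag only works if Google is allowed to crawl the page and read it. A minimal sketch, with a hypothetical path:

```
# robots.txt — blocks crawling, but the URL may still appear as an uncrawled result
User-agent: *
Disallow: /private-page.html
```

```html
<!-- In the page's <head> — allow crawling, but keep the page out of the index -->
<meta name="robots" content="noindex">
```

The two approaches should not be combined on the same page: if robots.txt blocks the crawler, it can never read the noindex tag.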
Read the whole story at Matt Cutts: Gadgets, Google, and SEO »