What I learnt today - robots.txt parsing order

For those looking at SEO conundrums.

So what I learnt today is that if your robots.txt looks like this:

#

User-agent: *
Disallow: /a
Disallow: /b

User-agent: Googlebot
Disallow: /c

#

Google’s search crawler will ignore the lines for User-agent: * and only read the ones specified for itself. So /a and /b get crawled and /c does not.

However, if you put the blocks the other way round:

#

User-agent: Googlebot
Disallow: /c

User-agent: *
Disallow: /a
Disallow: /b

#

Then Google will correctly not spider /a, /b or /c.

I can find no mention of this ordering rule anywhere, though.

* UPDATE *

OK, I’ve noted that Google does actually document this; more to the point, if you have a bot-specific block then that bot will ignore the rules in the User-agent: * block.
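The group-selection behaviour can be poked at locally with Python's standard-library `urllib.robotparser`. This is only a sketch: that parser is not Google's, so it shows the "bot-specific block wins" rule as one implementation applies it, not how Googlebot itself behaves. The agent name `SomeOtherBot` and the helper `allowed` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The first example from this post, with one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /a
Disallow: /b

User-agent: Googlebot
Disallow: /c
"""


def allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Print the verdict for each agent/path pair.
for agent in ("Googlebot", "SomeOtherBot"):
    for path in ("/a", "/b", "/c"):
        print(agent, path, allowed(ROBOTS_TXT, agent, path))
```

With this parser, Googlebot is allowed /a and /b and blocked only from /c, while the made-up agent falls back to the User-agent: * block.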

So I’ve had to go and paste the shared rules into each and every bot-specific section.
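Pasting by hand gets old quickly, so here is a minimal sketch of generating the file instead, with the shared rules copied into every block. The names `SHARED_RULES`, `BOT_RULES` and `build_robots_txt` are hypothetical, and the paths are just the /a, /b, /c placeholders from above.

```python
# Rules every bot should obey.
SHARED_RULES = ["Disallow: /a", "Disallow: /b"]

# Extra rules per bot; "*" is the catch-all block.
BOT_RULES = {
    "*": [],
    "Googlebot": ["Disallow: /c"],
}


def build_robots_txt(shared, per_bot):
    """Build robots.txt text with the shared rules copied into every block."""
    groups = []
    for agent, extra in per_bot.items():
        lines = [f"User-agent: {agent}", *shared, *extra]
        groups.append("\n".join(lines))  # no blank lines inside a block
    # Exactly one blank line between blocks, trailing newline at the end.
    return "\n\n".join(groups) + "\n"


print(build_robots_txt(SHARED_RULES, BOT_RULES))
```

Keeping the shared rules in one list means a new bot-specific block cannot silently miss them.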

But what I have learnt today (1 day later) is that if you leave blank lines inside a user-agent block then some engines (Yandex, for one) will disregard the instruction. I also had fun reading translated Russian webmaster guidelines.
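A quick way to catch stray blank lines is to flag any Allow or Disallow line that comes straight after an empty line. This is a rough sketch of my own heuristic, not anything taken from Yandex's guidelines; the function name `blank_lines_inside_groups` is hypothetical.

```python
def blank_lines_inside_groups(robots_txt: str) -> list[int]:
    """Return 1-based line numbers of rule lines that follow a blank line.

    A rule (Allow/Disallow) directly after a blank line suggests the blank
    line has split a user-agent block in two.
    """
    suspicious = []
    previous_blank = False
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            # Only a truly empty line counts; a comment-only line is skipped.
            if not raw.strip():
                previous_blank = True
            continue
        field = line.split(":", 1)[0].strip().lower()
        if previous_blank and field in ("allow", "disallow"):
            suspicious.append(number)
        previous_blank = False
    return suspicious


SAMPLE = """\
User-agent: Yandex
Disallow: /a

Disallow: /b

User-agent: *
Disallow: /c
"""

print(blank_lines_inside_groups(SAMPLE))
```

Here the `Disallow: /b` line is the one reported, since it is separated from its User-agent line by a blank line.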
