Did you know that Google Search checks about four billion host names every day for robots.txt? Gary Illyes said in the December Search Off The Record podcast, “we have about four billion host names that we check every single day for robots.txt.”
He said this at the 20:31 mark in the video. He added that if Google checks four billion host names daily, then “the number of sites is probably over or very likely over four billion.”
I spotted this video via Glenn Gabe:
Google’s Gary Illyes in the latest SOTR Podcast: Google has about 4 billion hostnames that it checks every single day for robots.txt https://t.co/Irc2outOM4 pic.twitter.com/lyb68pnR7d
— Glenn Gabe (@glenngabe) December 22, 2023
Here is the transcript:
GARY ILLYES: Yeah, and I mean, that’s one of the problems that we brought up early on. If we implement something or if we come up or suggest something that could work, that should not put more strain on publishers because if you think about it, if you go through our robots.txt cache, you can see that we have about four billion host names that we check every single day for robots.txt. Now, let’s say that all of those have subdirectories, for example. So the number of sites is probably over or very likely over four billion.
JOHN MUELLER: How many of those are in Search Console? I wonder.
GARY ILLYES: John, stop it.
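For context, the kind of robots.txt check Gary describes boils down to fetching the file once per host name and evaluating its rules against each URL before crawling. Here is a minimal sketch using Python’s standard-library parser; the host name and rules are assumptions for illustration, not Google’s actual process or any real site’s file:

```python
from urllib.robotparser import RobotFileParser

# Assumed example rules; a real crawler would fetch
# https://<hostname>/robots.txt once per host name instead.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(rules)

# Each candidate URL is tested against the cached rules for its host.
print(parser.can_fetch("MyBot", "https://example.com/public/page"))   # True
print(parser.can_fetch("MyBot", "https://example.com/private/page"))  # False
```

Because the rules apply per host name, a crawler caches one parsed robots.txt per host, which is why Google’s cache maps to roughly four billion host names rather than individual pages.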