Google Reveals Two New Web Crawlers

By Chris Barnhart Last updated May 18, 2024

Google revealed details of two new crawlers that are optimized for scraping image and video content for “research and development” purposes. Although the documentation doesn’t explicitly say so, it’s presumed that blocking the new crawlers has no impact on ranking.

Note that the data scraped by these crawlers is not explicitly intended for AI training; that is the role of Google-Extended.

GoogleOther Crawlers

The two new crawlers are variants of Google’s GoogleOther crawler, which launched in April 2023. The original GoogleOther crawler was likewise designated for use by Google product teams for research and development, in what Google describes as one-off crawls. That description offers clues about what the new GoogleOther variants will be used for.

The purpose of the original GoogleOther crawler is officially described as:

“GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.”

Two GoogleOther Variants

There are two new GoogleOther crawlers:

GoogleOther-Image
GoogleOther-Video

The new variants are for crawling binary data, meaning data that isn’t text. HTML is generally treated as a text file, encoded as ASCII or Unicode: if a file can be read in a text viewer, it’s a text file. Binary files are files that can’t be opened meaningfully in a text viewer, such as image, audio, and video files.

The new GoogleOther variants are for image and video content. Google lists user agent tokens for both of the new crawlers, which can be used in robots.txt to block them.
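As a rough illustration of blocking by token, the sketch below embeds a sample robots.txt and checks it with Python's standard `urllib.robotparser`. It assumes the tokens match the crawler names listed above (GoogleOther-Image and GoogleOther-Video); the `ROBOTS_TXT` contents, the `is_allowed` helper, and the example.com URL are hypothetical. Python's parser does not necessarily match rules the same way Google's crawlers do, so treat this as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the two new variants, allow everyone else.
# Assumes the user agent tokens are the crawler names given in the article.
ROBOTS_TXT = """\
User-agent: GoogleOther-Image
Disallow: /

User-agent: GoogleOther-Video
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this robots.txt lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/images/photo.jpg"  # placeholder URL
    for agent in ("GoogleOther-Image", "GoogleOther-Video", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

Because the rules target only the two variant tokens, other crawlers fall through to the wildcard group and remain allowed.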

1. GoogleOther-Image

User agent tokens:

Newly Updated GoogleOther User Agent Strings

Google also updated the user agent strings for the regular GoogleOther crawler. For blocking purposes you can continue using the same user agent token as before (GoogleOther). The new user agent strings are simply the data sent to servers to identify the crawler in full, in particular the technology used. In this case the technology is Chrome, with the version number periodically updated to reflect which version is in use (W.X.Y.Z is a placeholder for the Chrome version number in the strings listed below).

The full list of GoogleOther user agent strings:

Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; GoogleOther)

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/W.X.Y.Z Safari/537.36
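For anyone inspecting server logs, the following is a minimal sketch of how the two strings above might be recognized and the Chrome version pulled out. The function names and regular expressions are illustrative assumptions, not anything Google publishes, and the version number in the usage example is a made-up stand-in for the W.X.Y.Z placeholder. A user agent header can be spoofed, so a match only shows that a request claims to be GoogleOther.

```python
import re
from typing import Optional

# Matches the "compatible; GoogleOther" marker present in both listed strings.
# The negative lookahead skips hyphenated names such as GoogleOther-Image.
_GOOGLEOTHER = re.compile(r"\bcompatible; GoogleOther\b(?!-)")

# Captures whatever follows "Chrome/", including the W.X.Y.Z placeholder.
_CHROME = re.compile(r"Chrome/([\w.]+)")


def looks_like_googleother(user_agent: str) -> bool:
    """True if the string carries the generic GoogleOther marker."""
    return _GOOGLEOTHER.search(user_agent) is not None


def chrome_version(user_agent: str) -> Optional[str]:
    """Return the Chrome version token from the string, or None if absent."""
    match = _CHROME.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Desktop string from the list above, with a hypothetical version filled in.
    ua = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
          "GoogleOther) Chrome/124.0.0.0 Safari/537.36")
    print(looks_like_googleother(ua), chrome_version(ua))
```

Matching on the marker rather than the whole string keeps the check stable as the Chrome version changes over time.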

Read the updated Google crawler documentation