Amazon Web Services S3 Blocked Googlebot In June

Back in mid-June, I noticed that Google was not showing many of my images in Google Search and Discover, and some readers pointed it out to me as well. Using the handy Google Search Console URL Inspection tool, I found that the S3 URLs I use to host my images were blocking Googlebot from crawling. Here is a bit of a case study from yours truly on the indexing and crawling issue I had with my image URLs.

This AWS bug led to an 83% drop in the impressions my images were getting from Google Search and Google Images, and a 76% drop in image search related clicks to this site. Several weeks later, I am still down by about 16% in impressions and 26% in clicks from image search, but that is a huge improvement.

Here is the Google Search Console Search Performance report showing the impressions and clicks chart over time. You will see the drop around June 15th, then it starts to pick back up around July 8th. You will also see that my image traffic has still not fully returned to its normal pre-AWS-bug numbers, even after two months:

Google Search Console Performance Images

When Googlebot tried to access my image URLs on S3, it got a 404 Not Found error. But when I visited the same URLs from my computer, they loaded just fine. These are the same image URLs I have been using on this site for well over a decade and poof, one day, AWS decided to block Googlebot. I reached out to both Google and AWS about the issue, and I suspect it was a pretty big one. Tons of sites use S3 for image and file storage, so Googlebot was likely getting tons of 404 errors. The weird part is that I saw zero public complaints about the issue.
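
For readers who want a rough first check of this kind of mismatch on their own image URLs, here is a minimal sketch in Python. It requests the same URL twice, once with a browser-style User-Agent and once with Googlebot's published User-Agent string, and compares the status codes. The function names and the browser User-Agent string are my own assumptions for illustration. I also do not know how AWS was actually identifying Googlebot; if the block was based on IP address rather than User-Agent, a spoofed request like this would not reproduce it, so the URL Inspection tool remains the real test.

```python
import urllib.error
import urllib.request

# Googlebot's published desktop User-Agent token.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# An arbitrary browser-style User-Agent (illustrative assumption).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return err.code


def compare_user_agents(url, fetch=fetch_status):
    """Fetch the URL as a browser and as 'Googlebot' and flag a mismatch.

    `fetch` is injectable so the comparison can be exercised without a network.
    """
    result = {
        "browser": fetch(url, BROWSER_UA),
        "googlebot": fetch(url, GOOGLEBOT_UA),
    }
    # Only flags the pattern described above: fine for people, broken for the bot.
    result["blocked_for_googlebot"] = (
        result["browser"] == 200 and result["googlebot"] != 200
    )
    return result
```

Calling `compare_user_agents()` with one of your own image URLs returns both status codes plus a `blocked_for_googlebot` flag. A clean result here does not prove Googlebot can crawl the URL. It only rules out a simple User-Agent based block.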

In any event, this is what Googlebot saw when it tried to crawl those URLs:

Google Rich Result Url Blocked

AWS fixed it after several days:

Google Rich Result Url Unblocked

This is what my images looked like in the URL Inspection tool in Google Search Console:

Gsc Url Inspec Broken Images

It should look something like this:

Gsc Url Inspec Working Images

Since then, I have decided to move my images to AWS's CloudFront, a service that was not available when I first made this site, which is why I used S3 for images back then. The S3 issue with Googlebot is still fixed and working fine. But I am not going back to S3 for images.

I should thank Glenn Gabe for also noticing early on that the images were going away in Google Discover. Glenn also wrote up an image migration article, which I reviewed before making the switch from AWS S3 to AWS CloudFront. I did not migrate my old images; I left them in place because AWS fixed the issue. But since late June, all my new images have been using CloudFront.

To be clear, this was not a Google bug but an AWS change that led to AWS S3 blocking Googlebot. It is now resolved, but it seems like the damage has been done. If the graphs change more, I will update this story below to document the changes. So far, though, it has been flat for the past five weeks or so, so I am not expecting big changes in the future.

Forum discussion at X.
