Google’s John Mueller said in an SEO hangout last Friday that he thinks it is impossible for Google to determine that one piece of content is equivalent to another when the two are written for different languages or countries. So Google basically trusts the hreflang annotations provided by publishers.
At the 26:28 mark in this video, Google’s John Mueller was asked how Google measures the similarity of pages. John said “we don’t.” Google just uses the “hreflang to understand which of these URLs are equivalent from your point of view and we will swap those out,” John said. John added “for hreflang, I think it’s impossible for us to understand that this specific content is equivalent for another country or another language.” In other words, for hreflang Google does not compare the content of the pages themselves; it only does that kind of comparison for things like rel canonical.
Here is the transcript:
AUDIENCE: OK, so how does Google measure the similarity of pages?
JOHN MUELLER: I think we don’t. I think we basically use the hreflang to understand which of these URLs are equivalent from your point of view. And we will swap those out.
AUDIENCE: Oh, OK, so not from the content point of view, maybe some–
JOHN MUELLER: No.
AUDIENCE: — similar content.
JOHN MUELLER: No, I — we would only do that for things like the rel canonical to understand what the canonical URL is. But for hreflang, I think it’s impossible for us to understand that this specific content is equivalent for another country or another language. Like, there are so many local differences that are always possible.
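Since Google relies on what publishers declare, the hreflang annotations themselves need to be complete. Here is a minimal Python sketch of how a publisher might generate the link tags for one set of equivalent pages. The function name hreflang_links, the example.com URLs, and the language codes are made up for illustration; only the <link rel="alternate" hreflang="..." href="..."> markup itself is the standard annotation.

```python
from html import escape


def hreflang_links(alternates, x_default=None):
    """Build <link rel="alternate"> tags for one set of equivalent URLs.

    alternates: dict mapping an hreflang value (e.g. "en-gb") to a URL.
    x_default: optional fallback URL for users who match no listed locale.
    Both names are hypothetical helpers for this sketch.
    """
    tags = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in sorted(alternates.items())
    ]
    if x_default is not None:
        tags.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}" />'
        )
    return tags


# Hypothetical set of pages the publisher considers equivalent.
pages = {
    "en-us": "https://example.com/us/",
    "en-gb": "https://example.com/uk/",
    "de-de": "https://example.com/de/",
}

for tag in hreflang_links(pages, x_default="https://example.com/"):
    print(tag)
```

The same full set of tags, including the self-referencing one, would go in the head of every page in the group, so that each page points to all of its alternates and they point back.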