Back in May, Google’s Gary Illyes sat for an interview at SERP Conf 2024 in Bulgaria and answered a question about the causes of “crawled but not indexed,” offering several reasons that are helpful for debugging and fixing the issue.
Although the interview happened in May, the video went underreported and few people have actually watched it. I only heard of it because the always awesome Olesia Korobka (@Giridja) recently drew attention to the interview in a Facebook post.
The information is still timely and useful.
“Crawled - currently not indexed” is a status in the Google Search Console Page Indexing report indicating that a page was crawled by Google but was not indexed.
During a live interview someone submitted a question, asking:
“Can crawled but not indexed be a result of a page being too similar to other stuff already indexed?
So is Google suggesting there is enough other stuff already and your stuff is not unique enough?”
Google’s Search Console documentation doesn’t explain why Google may crawl a page and not index it, so it’s a legitimate question.
Gary Illyes answered that yes, one of the reasons could be that similar content is already indexed. But he went on to say that there are other reasons, too.
He answered:
“Yeah, that that could be one thing that it can mean. Crawled but not indexed is, ideally we would break up that category into more granular chunks, but it’s super hard because of how the data internally exists.
It can be a bunch of things, dupe elimination is one of those things, where we crawl the page and then we decide to not index it because there’s already a version of that or an extremely similar version of that content available in our index and it has better signals.
But yeah, but it it can be multiple things.”
Gary then called attention to another reason why Google might crawl but choose not to index a page, saying that it could be a site quality issue.
Illyes then continued his answer:
“And the general quality of the of the site, that can matter a lot of how many of these crawled but not indexed you see in search console. If the number of these URLs is very high that could hint at general quality issues.
And I’ve seen that a lot since February, where suddenly we just decided that we are indexing a vast amount of URLs on a site just because …our perception of the site has changed.”
Gary next offered two other explanations for a rising count of crawled but not indexed URLs: either Google’s perception of the site has changed, or a technical error occurred.
Gary explained:
“…And one possibility is that when you see that number rising, that the perception of… Google’s perception of the site has changed, that could be one thing.
But then there could also be that there was an error, for example on the site and then it served the same exact page to every single URL on the site. That could also be one of the reasons that you see that number climbing.
So yeah, there could be many things.”
Gary provided answers that should help debug why a web page might be crawled but not indexed by Google.
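The technical-error case Illyes mentions, where a site serves the same exact page for every URL, is one that a site owner can check for directly. The following is a minimal sketch, not a description of how Google detects duplicates. It assumes the page bodies have already been fetched into a dictionary, and the function names and sample URLs are hypothetical, chosen only for illustration.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def normalize(html: str) -> str:
    """Collapse runs of whitespace so trivial formatting differences are ignored."""
    return re.sub(r"\s+", " ", html).strip()


def group_identical_pages(pages: Dict[str, str]) -> List[List[str]]:
    """Return groups of two or more URLs whose normalized bodies are identical."""
    by_hash: Dict[str, List[str]] = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        by_hash[digest].append(url)
    return [sorted(urls) for urls in by_hash.values() if len(urls) > 1]


# Hypothetical sample data standing in for responses fetched from a site.
sample_pages = {
    "https://example.com/a": "<html><body>Service unavailable</body></html>",
    "https://example.com/b": "<html><body>Service   unavailable</body></html>",
    "https://example.com/c": "<html><body>Real article text</body></html>",
}

duplicates = group_identical_pages(sample_pages)
print(duplicates)
```

If most of a site's URLs land in a single group, that points to the kind of serving error Illyes describes rather than a quality issue. Exact hashing only catches identical bodies; near-duplicate pages would need a similarity measure instead.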
Although Illyes didn’t elaborate on what he meant by a version with better signals, I’m fairly certain that he’s describing the scenario in which a site syndicates its content to another site and Google chooses to rank the other site for the content and not the original publisher.
Gary answers this question at the 9-minute mark of the recorded interview.