Google On Discovered – Currently Not Indexed

Google’s Martin Splitt posted a video in his SEO Made Easy series on the Google Search Console “Discovered – Currently Not Indexed” status in the page indexing report. In short, there are three primary reasons you’d see pages in this category:

(1) Quality issues with those pages

(2) Your server is slow for Googlebot

(3) Google just needs more time to index those pages (may be related to #2 above).

On the quality issue, Martin Splitt said, “When Google Search notices a pattern of low-quality or thin content on pages, they might be removed from the index and might stay in discovered.”
“Googlebot knows about these pages but is choosing not to proceed with them,” because they are not high quality enough, he explained. He added, “If Google Search detects a pattern in URLs with low-quality content on your site, it might skip these URLs altogether, leaving them in discovered as well.”

What can you do? “If you care about these pages you might want to rework the content to be of higher quality and make sure your internal linking relates this content to other parts of your existing content,” he said. So look at the content and improve it, but also see which already-indexed pages you can link to that content from.
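One practical way to act on the internal linking advice is to compare the URLs in your sitemap against the URLs your pages actually link to, and flag the orphans. Here is a minimal sketch of that idea in Python, using requests and BeautifulSoup; the sitemap URL and crawl budget are placeholders, and a real audit would also need to handle sitemap indexes, redirects, and robots.txt.

```python
# Minimal sketch: find pages in your sitemap that receive no internal links.
# SITEMAP_URL and MAX_PAGES are illustrative placeholders.
from urllib.parse import urljoin, urlparse
import xml.etree.ElementTree as ET

import requests
from bs4 import BeautifulSoup

SITEMAP_URL = "https://www.example.com/sitemap.xml"  # placeholder
MAX_PAGES = 500  # crawl budget for this sketch

def sitemap_urls(sitemap_url):
    """Return the <loc> URLs listed in a simple (non-index) XML sitemap."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(requests.get(sitemap_url, timeout=10).content)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", ns)}

def internal_link_targets(urls):
    """Fetch each URL and collect every same-host link it points to."""
    targets = set()
    for url in list(urls)[:MAX_PAGES]:
        host = urlparse(url).netloc
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue  # skip pages that fail to fetch
        for a in BeautifulSoup(html, "html.parser").find_all("a", href=True):
            link = urljoin(url, a["href"]).split("#")[0]
            if urlparse(link).netloc == host:
                targets.add(link)
    return targets

if __name__ == "__main__":
    pages = sitemap_urls(SITEMAP_URL)
    linked = internal_link_targets(pages)
    for orphan in sorted(pages - linked):
        print("No internal links found for:", orphan)
```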

To be clear, Google’s help documentation for “Discovered – currently not indexed” only really mentions server issues. It reads:

The page was found by Google, but not crawled yet. Typically, Google wanted to crawl the URL but this was expected to overload the site; therefore Google rescheduled the crawl. This is why the last crawl date is empty on the report.

But as we covered back in 2018, we know it is also about quality issues. So this is not new, but it is nice to have a video on this.
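As an aside, if you want to check this status for many URLs without clicking through the report, the Search Console URL Inspection API returns the same coverage state. Here is a minimal sketch in Python; it assumes you already have an OAuth 2.0 access token with read access to your verified property, and the token and URLs shown are placeholders.

```python
# Minimal sketch: check a URL's coverage state via the Search Console
# URL Inspection API. ACCESS_TOKEN, SITE_URL, and PAGE_URL are placeholders.
import requests

ACCESS_TOKEN = "ya29.placeholder"               # placeholder OAuth token
SITE_URL = "https://www.example.com/"           # your verified property
PAGE_URL = "https://www.example.com/some-page"  # page to inspect

resp = requests.post(
    "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json={"inspectionUrl": PAGE_URL, "siteUrl": SITE_URL},
    timeout=10,
)
resp.raise_for_status()
result = resp.json()["inspectionResult"]["indexStatusResult"]
# coverageState carries strings like "Discovered - currently not indexed"
print(PAGE_URL, "->", result.get("coverageState"))
```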

Here is the video:

Here is a screenshot of the page indexing report showing the “Discovered – Currently Not Indexed” status for this site:

Google Search Console Page Indexing Report

Here is the transcript:

Google Video On Discovered – Currently Not Indexed

Today, we will dive into Google Search Console’s “Discovered – currently not indexed” status in the page indexing report.

When using Google Search Console, and you should use it, you probably went into the page indexing report and perhaps saw these kinds of reasons for pages not being indexed. One of the most frequent questions we’re getting about this is the “Discovered – currently not indexed” status. Let’s see what it means and what you could do about it.

First and foremost, Google will almost never index all content from a site. This isn’t an error, and it’s not necessarily even a problem that needs looking into. It’s a note on the status of the pages mentioned there. To understand what this means, we need to look at how a page proceeds through the systems and processes that make up Google Search.

At the very beginning, Googlebot finds a URL somewhere; that can be a sitemap or a link, for example. Googlebot has now discovered that this URL exists. Googlebot basically puts it into a to-do list of URLs to visit and possibly index later on. In an ideal world, Googlebot would immediately get to work on this URL, but as you probably know from your own to-do list, that isn’t always possible.

And that’s the first reason why you might see this in Google Search Console: Googlebot simply didn’t get around to crawling the URL yet because it was busy with other URLs. So sometimes it’s just a matter of a bit more patience on your end. Eventually, Googlebot might get around to crawling it. That’s the moment when it fetches the page from your server and processes it further to potentially index it. Once it gets to crawling, the URL would either move on to “Crawled – currently not indexed” or the page gets indexed.
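To picture that to-do list, here is a toy sketch of a discovery queue in Python. It is purely illustrative of the discovered-versus-crawled distinction, not a description of how Googlebot actually works.

```python
# Toy illustration: URLs are discovered first, crawled later when there
# is capacity. Purely illustrative, not how Googlebot actually works.
from collections import deque

class DiscoveryQueue:
    def __init__(self):
        self.seen = set()    # every URL we know exists ("discovered")
        self.todo = deque()  # URLs still waiting to be crawled

    def discover(self, url):
        """Record a URL found in a sitemap or link."""
        if url not in self.seen:
            self.seen.add(url)
            self.todo.append(url)

    def next_to_crawl(self):
        """Pop the next URL when the crawler has capacity; None if empty."""
        return self.todo.popleft() if self.todo else None

queue = DiscoveryQueue()
queue.discover("https://www.example.com/new-product")  # placeholder URL
print(queue.next_to_crawl())  # crawled only once capacity allows
```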

But what if it doesn’t get crawled and stays in “Discovered – currently not indexed”? Well, that usually has to do either with your server or with your website’s quality.

Let’s look at potential technical reasons first. Say you have a webshop and just added 1,000 new products. Googlebot discovers all these products at the same time and would like to crawl them. In previous crawls, however, it has noticed that your server gets really slow or even overwhelmed when it tries to crawl more than 10 products at the same time. It wants to avoid overwhelming your server, so if it decides to crawl, it might do so over a longer period of time, say 10 products at a time over a few hours, rather than all 1,000 products within the same hour. That means not all 1,000 products get crawled at once, and Googlebot will take longer to get around to these products.
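As a rough illustration of that pacing, here is a toy sketch in Python: 1,000 URLs crawled in batches of 10 with a pause between batches works out to roughly an hour and 40 minutes rather than one burst. The batch size and pause are made-up values, not Googlebot’s.

```python
# Toy sketch of paced crawling: 1,000 new URLs fetched in small batches
# instead of all at once, to avoid overwhelming the server.
# BATCH_SIZE and PAUSE_SECONDS are illustrative values.
import time

urls = [f"https://shop.example.com/product/{i}" for i in range(1000)]
BATCH_SIZE = 10      # pretend the server slows down past 10 fetches
PAUSE_SECONDS = 60   # wait between batches

for start in range(0, len(urls), BATCH_SIZE):
    batch = urls[start:start + BATCH_SIZE]
    # fetch_batch(batch) would crawl these 10 URLs; stubbed out here
    print(f"Crawling {len(batch)} URLs starting at #{start}")
    time.sleep(PAUSE_SECONDS)  # 100 batches x 60s = 6,000s, about 1h40m
```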

It makes sense to look at the crawl stats report, and the responses breakdown in there, to see if your server responds slowly or with HTTP 500 errors when Googlebot tries to crawl. Note that this usually only matters for sites with very large numbers of pages, say millions or more, but server issues can happen with smaller sites too. It makes sense to check with your hosting company what to do to fix these performance issues if they arise.
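You can also look for the same symptoms in your own server access logs. Here is a minimal sketch in Python that flags slow or 5xx responses served to Googlebot; it assumes a combined-style log line with the response time in seconds as the last field, so adjust the regex to your server’s actual format.

```python
# Minimal sketch: flag slow or 5xx responses served to Googlebot in an
# access log. Assumes the status code follows the quoted request and the
# response time in seconds is the last field; adjust for your log format.
import re

LOG_PATH = "access.log"  # placeholder path
pattern = re.compile(r'" (\d{3}) .*Googlebot.* (\d+\.\d+)$')

with open(LOG_PATH) as log:
    for line in log:
        match = pattern.search(line)
        if not match:
            continue
        status, seconds = int(match.group(1)), float(match.group(2))
        if status >= 500 or seconds > 2.0:  # thresholds are illustrative
            print(f"{status} in {seconds:.2f}s: {line.strip()}")
```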

The other, far more common, reason for pages staying in “Discovered – currently not indexed” is quality, though. When Google Search notices a pattern of low-quality or thin content on pages, they might be removed from the index and might stay in discovered. Googlebot knows about these pages but is choosing not to proceed with them. If Google Search detects a pattern in URLs with low-quality content on your site, it might skip these URLs altogether, leaving them in discovered as well.
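If you want a first-pass look for thin pages yourself, a crude proxy is visible word count. Here is a toy sketch in Python using requests and BeautifulSoup; the URL list and threshold are placeholders, and word count is only a rough signal, not how Google measures quality.

```python
# Toy sketch: flag potentially thin pages by visible word count.
# Word count is a crude proxy, not how Google measures quality;
# URLS and MIN_WORDS are illustrative placeholders.
import requests
from bs4 import BeautifulSoup

URLS = [
    "https://www.example.com/page-a",  # placeholder URLs
    "https://www.example.com/page-b",
]
MIN_WORDS = 200  # illustrative threshold

for url in URLS:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()  # drop non-content markup before counting
    words = len(soup.get_text(" ", strip=True).split())
    if words < MIN_WORDS:
        print(f"Possibly thin ({words} words): {url}")
```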

If you care about these pages, you might want to rework the content to be of higher quality and make sure your internal linking relates this content to other parts of your existing content. See our episode on internal linking for more information on this.

So, in summary, some sites will have some pages that won’t get indexed, and that’s usually fine. If you think a page should be indexed, then you should consider checking the quality of the content on the pages that stay in “Discovered – currently not indexed”. Make sure, as well, that your server isn’t giving Googlebot signals that it is overwhelmed when it’s crawling.

Forum discussion at X.
