Google’s 14,000 Search Ranking Features Leaked Through Anonymous Source

Rand Fishkin and Mike King may have published one of the biggest data leaks about Google Search and its internal ranking features and signals, outside of what the Department of Justice case revealed. The document came from an anonymous source but was verified by Rand Fishkin, and it contains a ton of details on how Google Search reportedly works.

More importantly, it seems to contradict a number of statements made over the past two decades by numerous Google Search employees, as I have covered here in the past.

I have not gone through it all yet, but I felt it was important for you all to read this yourself. You can see the details at these headlines:

Rand wrote, “Many of their claims directly contradict public statements made by Googlers over the years, in particular the company’s repeated denial that click-centric user signals are employed, denial that subdomains are considered separately in rankings, denials of a sandbox for newer websites, denials that a domain’s age is collected or considered, and more.”

Mike King wrote, “I have reviewed the API reference docs and contextualized them with some other previous Google leaks and the DOJ antitrust testimony. I’m combining that with the extensive patent and whitepaper research done for my upcoming book, The Science of SEO. While there is no detail about Google’s scoring functions in the documentation I’ve reviewed, there is a wealth of information about data stored for content, links, and user interactions. There are also varying degrees of descriptions (ranging from disappointingly sparse to surprisingly revealing) of the features being manipulated and stored. You’d be tempted to broadly call these “ranking factors,” but that would be imprecise.”

Aleyda Solis posted a quick summary on X covering part of the leak:

  • There are 14K ranking features and more in the docs
  • Google has a feature they compute called “siteAuthority”
  • Navboost has a specific module entirely focused on click signals, representing users as voters, with their clicks stored as their votes
  • Google stores which result has the longest click during the session
  • Google has an attribute called hostAge that is used specifically “to sandbox fresh spam in serving time”
  • One of the modules related to page quality scores features a site-level measure of views from Chrome

I have not had time to go through everything yet; I will do that over the next several days.

I have also not seen any Googler publicly comment on this yet. I know it is new, and I don’t know if we will see any Googler comment on this.

This reminds me a bit of the Yandex search ranking leak.

Here are some posts on social about this. Again, this has only been out for a few hours, and no one but Rand and Mike has had any real time to process this in detail.

Documentation related to the Google Search algorithm leaked and I spent the weekend tearing it apart. https://t.co/v71B16Ggov

✌🏾

— Mic King (@iPullRank) May 28, 2024

— Aleyda Solis 🕊️ (@aleyda) May 28, 2024

Extremely interesting blog post by @iPullRank.
Another one of the many he writes and we save for is usefulness ⬇️ https://t.co/VZH8EARV1G

— Gianluca Fiorelli (@gfiorelli1) May 28, 2024

Apparently someone at Google Search “accidentally” leaked an engineering document that reveals a ton of secrets about how the search engine works, including that they have a “Golden Document” flag which puts more weight on a document that is “Human labeled” which could mean some… pic.twitter.com/zeG79f161B

— Joe Youngblood (@YoungbloodJoe) May 28, 2024

If you want to geek out on this with me, I’ll keep updating this Google Doc for the next ~30 minutes with anything interesting before getting back to normal life. https://t.co/1iQ40nknZ0

— Glen Allsopp 👾 (@ViperChill) May 28, 2024

I am looking forward to really digging in on this.

Forum discussion at X.
