Over the past decade, SEO has shifted from spreadsheet-driven, anecdotal best practices toward a more data-driven approach, as evidenced by the growing number of SEO pros learning Python.
As Google’s updates grow in number (11 in 2023), SEO professionals are recognizing the need for a more data-driven approach, and internal link structures for site architectures are no exception.
In a previous article, I outlined how internal linking could be more data-driven, with Python code for evaluating the site architecture statistically.
Beyond Python, data science can help SEO professionals more effectively uncover hidden patterns and key insights to help signal to search engines the priority of content within a website.
Data science is the intersection of coding, math, and domain knowledge, where the domain, in our case, is SEO.
So while math and coding (invariably in Python) matter, SEO knowledge is no less important: asking the right questions of the data and having an instinctive feel for whether the numbers “look right” are essential.
Align Site Architecture To Support Underlinked Content
Many sites are built like a Christmas tree, with the home page at the very top (being the most important) and other pages in descending order of importance in subsequent levels.
The SEO scientists among you will want to know how links are distributed from different views. This can be visualized using the Python code from the previous article in several ways, including:
- Site depth.
- Content type.
- Internal Page Rank.
- Conversion Value/Revenue.
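As a rough illustration of what "different views" means in practice, the sketch below groups inbound internal link counts by a chosen field and summarizes each group. It is not the code from the previous article: the `pages` records, their field names (`site_depth`, `content_type`, `inlinks`) and the `summarise_inlinks` helper are all hypothetical stand-ins for a crawl export. Continuous views such as Internal Page Rank or revenue would need to be bucketed into bands first.

```python
from collections import defaultdict
from statistics import median

# Hypothetical crawl export: one record per URL. Field names are assumptions.
pages = [
    {"url": "/", "site_depth": 0, "content_type": "home", "inlinks": 120},
    {"url": "/blog/", "site_depth": 1, "content_type": "hub", "inlinks": 45},
    {"url": "/shop/", "site_depth": 1, "content_type": "hub", "inlinks": 60},
    {"url": "/blog/a", "site_depth": 2, "content_type": "article", "inlinks": 6},
    {"url": "/blog/b", "site_depth": 2, "content_type": "article", "inlinks": 2},
    {"url": "/shop/x", "site_depth": 2, "content_type": "product", "inlinks": 9},
    {"url": "/shop/y", "site_depth": 2, "content_type": "product", "inlinks": 15},
]


def summarise_inlinks(pages, view):
    """Group inbound internal link counts by `view` and summarise each group."""
    groups = defaultdict(list)
    for page in pages:
        groups[page[view]].append(page["inlinks"])
    return {
        key: {
            "pages": len(counts),
            "min": min(counts),
            "median": median(counts),
            "max": max(counts),
        }
        for key, counts in sorted(groups.items())
    }


# The same data seen from two of the views listed above.
by_depth = summarise_inlinks(pages, "site_depth")
by_type = summarise_inlinks(pages, "content_type")

for depth, stats in by_depth.items():
    print(f"depth {depth}: {stats}")
```

Swapping the `view` argument is all it takes to move from one view to another, which makes it easy to check whether the picture of link distribution changes depending on how the pages are grouped.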
The boxplot effectively shows how many links are “normal” for a given website at different site levels. The blue boxes represent the interquartile range (i.e., the span between the 25th and 75th percentiles), which is where the middle 50% of inbound internal link counts lie.
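To make the interquartile range concrete, here is a minimal sketch using Python's standard `statistics` module on a made-up list of inbound link counts for a single site level. The data are invented for illustration, and treating pages below the 25th percentile as candidates for more links is an assumption of this sketch rather than a fixed rule.

```python
from statistics import quantiles

# Hypothetical inbound internal link counts for pages at one site level.
inlinks = [1, 2, 2, 3, 4, 5, 5, 6, 8, 9, 12, 40]

# n=4 returns three cut points: the 25th, 50th and 75th percentiles.
q1, q2, q3 = quantiles(inlinks, n=4, method="inclusive")
iqr = q3 - q1  # height of the box in a boxplot

# Share of pages whose link count falls inside the box.
inside = [count for count in inlinks if q1 <= count <= q3]
share_inside = len(inside) / len(inlinks)

# Assumption of this sketch: counts below Q1 mark potentially underlinked pages.
underlinked = [count for count in inlinks if count < q1]

print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
print(f"share inside the box: {share_inside:.0%}")
print(f"counts below Q1: {underlinked}")
```

Note how the single page with 40 links barely moves the quartiles, which is why the box is a more robust description of "normal" than the average.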