1. PageRank: Weight and Quality
Key Insight
As I've said many times before, PageRank is still an active part of Google, so it's no surprise that it is mentioned numerous times in the documents. PageRank is the popularity algorithm that made Google popular; it's the foundation of their entire search engine.
You can read much more about the PageRank concept in this article, where I go in-depth on the subject.
PagerankWeight
This parameter indicates the weight stored in linkmaps for PageRank. This weight affects the ranking of pages based on the quality and relevance of links. The higher the PageRank, the more influence the link has on a page's placement in search results.
Linkmap
A linkmap is a detailed mapping that stores various attributes and metrics for links between pages. In addition to PageRank weight, a linkmap also contains link attributes (e.g., nofollow), anchor texts, and much more. The linkmap is used to calculate the concrete PageRank score (strength) of a given page. It should not be confused with Google's Link Graph, which represents the overall structure of the internet's link patterns.
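To make the role of a per-link weight more concrete, here is a minimal sketch of a weighted PageRank iteration. The `linkmap` dictionary shape, the example weights, and the damping factor of 0.85 are assumptions chosen for illustration; the leaked documents do not describe how Google actually computes the score.

```python
def weighted_pagerank(linkmap, damping=0.85, iterations=50):
    """Weighted PageRank over a hypothetical {source: {target: weight}} mapping."""
    # Collect every page that appears as a source or as a target.
    pages = set(linkmap)
    for targets in linkmap.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for source in pages:
            targets = linkmap.get(source, {})
            total_weight = sum(targets.values())
            if total_weight == 0:
                # A page without outgoing links spreads its rank evenly.
                for page in pages:
                    new_rank[page] += damping * rank[source] / n
                continue
            # Each link passes rank in proportion to its stored weight.
            for target, weight in targets.items():
                new_rank[target] += damping * rank[source] * weight / total_weight
        rank = new_rank
    return rank


# Hypothetical linkmap: page "a" links to "c" with three times the weight of "b".
linkmap = {
    "a": {"b": 1.0, "c": 3.0},
    "b": {"c": 1.0},
    "c": {"a": 1.0},
}
ranks = weighted_pagerank(linkmap)
```

In this toy graph, page "c" ends up with a higher score than page "b", because the link pointing to it carries more weight.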
SourceType
Indicates the quality of the anchor's source page (the page that links). The sourceType attribute records the quality of a link's source in relation to the indexing tier its content sits in. In short, the higher the indexing tier of a page, the more value links from that page are expected to carry. Links are classified as TYPE_HIGH_QUALITY, TYPE_MEDIUM_QUALITY, or TYPE_LOW_QUALITY, based on the source page's indexing tier.
Quality and Traffic
Perhaps somewhat surprisingly, this quality classification is based on the number of clicks a given page receives from organic results. This means that a link from a page that receives many clicks from organic results has a greater effect than a link from a page with few or none.
The question is whether PageRank flows at all through links from pages categorized as low quality. My bet is that it doesn't. The click-based classification confirms the thesis that links from pages that themselves have traffic are better links than links from pages without traffic. Perhaps not particularly surprising, but still a topic that has been widely debated over the last 5-10 years.
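The click-based tiering described above can be sketched as a simple threshold classifier. The three enum names come from the leak, but the click thresholds and the function name `classify_source` are hypothetical; the documents do not state where the boundaries lie.

```python
from enum import Enum


class SourceType(Enum):
    TYPE_HIGH_QUALITY = "high"
    TYPE_MEDIUM_QUALITY = "medium"
    TYPE_LOW_QUALITY = "low"


# Hypothetical thresholds: the real boundaries are not documented.
HIGH_CLICK_THRESHOLD = 1000
MEDIUM_CLICK_THRESHOLD = 50


def classify_source(monthly_organic_clicks: int) -> SourceType:
    """Map a page's organic clicks to a source quality tier (illustrative only)."""
    if monthly_organic_clicks >= HIGH_CLICK_THRESHOLD:
        return SourceType.TYPE_HIGH_QUALITY
    if monthly_organic_clicks >= MEDIUM_CLICK_THRESHOLD:
        return SourceType.TYPE_MEDIUM_QUALITY
    return SourceType.TYPE_LOW_QUALITY
```

Under these assumed thresholds, a page with 5,000 monthly organic clicks lands in the high tier and a page with no clicks lands in the low tier.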
2. Links in Fresh Content (TYPE_FRESHDOCS)
Fresh Content Matters
It's also mentioned that "TYPE_FRESHDOCS" is a special category and can be equated with TYPE_HIGH_QUALITY in terms of link importance. So links from fresh documents, i.e., newly published articles/pages, weigh more than links in older content.
While not directly documented, it seems likely that the two elements combined provide the best foundation: a link from a page classified as high quality that has just published new content, with your link placed in the publication itself.
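One way to picture TYPE_FRESHDOCS being equated with TYPE_HIGH_QUALITY is as an alias that resolves to the same weight. This is only a sketch: the numeric weights and the `link_weight` helper are invented for illustration and do not appear in the leaked documents.

```python
# Hypothetical weights per source type; only the type names come from the leak.
LINK_WEIGHT = {
    "TYPE_HIGH_QUALITY": 1.0,
    "TYPE_MEDIUM_QUALITY": 0.5,
    "TYPE_LOW_QUALITY": 0.1,
}

# Fresh documents are treated like high-quality sources.
EQUIVALENT_TYPE = {"TYPE_FRESHDOCS": "TYPE_HIGH_QUALITY"}


def link_weight(source_type: str) -> float:
    """Resolve aliases first, then look up the assumed weight."""
    canonical = EQUIVALENT_TYPE.get(source_type, source_type)
    return LINK_WEIGHT[canonical]
```

With this mapping, a link from a fresh document and a link from a high-quality page receive the same weight.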
Strategic Takeaway
Focus on getting links placed in newly published content on high-quality websites. These links carry significantly more weight than links added to older, existing pages.
3. Internal vs. External Links (isLocal)
This bit indicates whether the anchor's source and target pages are on the same domain, i.e., whether it's an internal or external link.
Internal Links
Used for cohesive structure, PageRank value distribution, and user navigation within the same website.
External Links
A stronger indicator of a page's authority and credibility, as recommendations come from other websites.
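As a rough illustration of what an isLocal check could look like, the sketch below compares the hostnames of the source and target URLs. This is a simplification and an assumption: a proper "same domain" test would compare registrable domains (for example via the Public Suffix List), and the leak does not say how Google derives the bit.

```python
from urllib.parse import urlsplit


def normalized_host(url: str) -> str:
    """Return the lowercased hostname without a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_local(source_url: str, target_url: str) -> bool:
    """True when both URLs share the same (normalized) hostname."""
    return normalized_host(source_url) == normalized_host(target_url)
```

For example, `is_local("https://www.example.com/a", "https://example.com/b")` evaluates to `True`, while a link from `example.com` to `other.org` evaluates to `False`.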
4. Scoring and Ranking
AggregatedScore
This is a score aggregated from all sources, which likely includes various signals such as PageRank, relevance, timeliness, and other factors that affect the overall score of a link. This aggregated score provides a comprehensive picture of the link's value as a whole. The score will therefore differ depending on what is being linked to and on which website.
TopicalityWeight
The topical weighting assigned to each link is influenced by both the PageRank from the linking page and the relevance of the anchor text. This means that a link from a relevant page with high PageRank has greater value.
This also confirms that anchor text relevance affects weighting – so yes, anchor texts still have great significance.
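To show how PageRank and anchor relevance could interact, here is a small sketch that multiplies the two. Both the formula and the word-overlap relevance measure are assumptions made for the example; the leak only states that both signals influence the topical weighting, not how they are combined.

```python
def anchor_relevance(anchor_text: str, topic_terms: list[str]) -> float:
    """Word overlap (Jaccard similarity) between anchor text and topic terms."""
    anchor_terms = set(anchor_text.lower().split())
    topic = {term.lower() for term in topic_terms}
    if not anchor_terms or not topic:
        return 0.0
    return len(anchor_terms & topic) / len(anchor_terms | topic)


def topicality_weight(source_pagerank: float, anchor_text: str,
                      topic_terms: list[str]) -> float:
    """Hypothetical combination: source PageRank scaled by anchor relevance."""
    return source_pagerank * anchor_relevance(anchor_text, topic_terms)
```

In this toy model, an anchor such as "click here" contributes nothing toward a page about link building, regardless of how strong the linking page is, while the anchor "link building" passes on the full PageRank share.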
5. Anchor and Hyperlink Data
Google uses data about anchor texts and links to uncover how links are structured and used in the content where the link is found. That is, where it's placed, what it stands near, whether it's relevant to the website being linked to, etc.
In the leaked information, we find these factors:
ByteEnd and ByteStart
Indices of the last and first byte covered by the hyperlink. They identify where the link starts and ends in the overall content and delimit the exact area the hyperlink covers. This can be crucial for analyzing how the link affects the text's structure and readability.
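The following sketch shows how byte indices can delimit a link inside a document. It assumes UTF-8 content and treats the end index as inclusive (since it is described as the last byte covered); both points, as well as the helper names, are assumptions for illustration.

```python
def anchor_byte_span(content: str, anchor_text: str) -> tuple[int, int]:
    """Return (byte_start, byte_end) of the anchor text, end index inclusive."""
    raw = content.encode("utf-8")
    needle = anchor_text.encode("utf-8")
    start = raw.find(needle)
    if start == -1:
        raise ValueError("anchor text not found in content")
    return start, start + len(needle) - 1


def text_for_span(content: str, byte_start: int, byte_end: int) -> str:
    """Recover the text covered by an inclusive byte span."""
    return content.encode("utf-8")[byte_start:byte_end + 1].decode("utf-8")


content = "Café owners: read our link building guide today."
byte_start, byte_end = anchor_byte_span(content, "link building guide")
covered = text_for_span(content, byte_start, byte_end)
```

Note that byte positions and character positions differ as soon as the text contains multi-byte characters: the "é" in the example shifts the byte offset one position past the character offset.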
Phrase
Index for the first and last token covered by the hyperlink. Tokens refer to words or word parts in the text, and by knowing the indices for these, one can get a detailed understanding of which specific words or phrases are connected via the hyperlink. This makes it possible to analyze the link's context in a more granular way.
URL
The absolute URL that the link points to. This is the full web address that specifies the destination for the hyperlink. Having the exact URL is crucial for being able to evaluate the link's target and its relevance to the content. The URL not only provides a destination but can also reveal information about the domain's authority, relevance to the topic, and contribute to understanding the overall link profile's value.
6. Homepage Trustworthiness (homePageInfo)
The homePageInfo attribute in the AnchorsAnchorSource module is crucial for assessing the value of a link based on the trustworthiness of its source, especially the homepage of the source website.
What is homePageInfo?
The attribute provides information about whether the source page for a link is a homepage and its level of trustworthiness. The possible values for homePageInfo are:
NOT_HOMEPAGE
The source page is not a homepage
NOT_TRUSTED
The homepage is not considered trustworthy
PARTIALLY_TRUSTED
The homepage has a moderate level of trustworthiness
FULLY_TRUSTED
The homepage is fully trustworthy
Role in Link Evaluation
Trustworthiness Evaluation
If the source page is the homepage, homePageInfo directly assigns a trustworthiness value (NOT_TRUSTED, PARTIALLY_TRUSTED, FULLY_TRUSTED).
Weighting Mechanism
- Full trustworthiness: Links from fully trustworthy homepages likely receive higher weight
- Partially trustworthy: Links from partially trustworthy homepages receive moderate weight
- Not trustworthy: Links from untrustworthy homepages receive lower weight
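The weighting described above could be modeled roughly as follows. The four enum values are taken from the leak, while the multipliers and the `trust_multiplier` helper are purely hypothetical; the documents give no numeric weights.

```python
from enum import Enum
from typing import Optional


class HomePageInfo(Enum):
    NOT_HOMEPAGE = 0
    NOT_TRUSTED = 1
    PARTIALLY_TRUSTED = 2
    FULLY_TRUSTED = 3


# Hypothetical multipliers: only their ordering reflects the text above.
TRUST_MULTIPLIER = {
    HomePageInfo.NOT_TRUSTED: 0.5,
    HomePageInfo.PARTIALLY_TRUSTED: 1.0,
    HomePageInfo.FULLY_TRUSTED: 1.5,
}


def trust_multiplier(info: HomePageInfo) -> Optional[float]:
    """Return the assumed multiplier, or None when the source is not a homepage."""
    return TRUST_MULTIPLIER.get(info)
```

Returning `None` for NOT_HOMEPAGE is a design choice in this sketch: the trust signal simply does not apply when the link does not come from a homepage.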
SEO Implications
- Earning Links: Getting links from fully trustworthy websites, especially their homepages, can significantly increase a website's credibility and ranking in Google's search results.
- Source Trustworthiness: The trustworthiness of the source page's homepage is crucial for determining the overall quality and weight of its outgoing links.
- Impact on Target Page: While homePageInfo concerns the link's source, it ultimately affects the target page's link profile and perceived authority based on the trustworthiness inherited from the source's homepage.
7. Outgoing Link Data
Outgoing Links
Contains data about outgoing hreflang links that appear in the document being processed. These links can affect the page's visibility in different regional search results.
ByteLength
The length in bytes of the link text, which can be relevant for assessing the link's prominence in the content.
ByteOffset
Byte offset for the start of a link in the annotated document's content, which is important for precisely locating the link.
IsNofollow
Indicates whether the link is a nofollow link, meaning it should not pass PageRank.
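To illustrate the IsNofollow flag, the sketch below collects outgoing links from an HTML fragment with Python's standard `html.parser` and marks those whose `rel` attribute contains `nofollow`. The `LinkCollector` class and the output dictionary shape are assumptions for the example, not the structure used in the leaked documents.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href targets and whether each link is marked nofollow."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel can hold several space-separated tokens, e.g. "nofollow sponsored".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        self.links.append({"url": href, "is_nofollow": "nofollow" in rel_tokens})


html = (
    '<p><a href="https://a.example/">A</a> '
    '<a rel="nofollow sponsored" href="https://b.example/">B</a></p>'
)
collector = LinkCollector()
collector.feed(html)
collector.close()
links = collector.links
```

Here the first link would be eligible to pass PageRank, while the second is flagged as nofollow.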
8. Quality and Locality of Links
Locality
For ranking purposes, the quality of an anchor is measured by its "locality" and "bucket". This means that the link's value can vary depending on its placement and context.
PagerankWeight
Weight to be stored in linkmaps for the PageRanker, which affects the page's overall ranking.
ParallelLinks
The number of additional links from the same source page to the same target domain, which can indicate the link's significance.
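Counting parallel links can be sketched as follows: for each link on a source page, count how many other links on that page point to the same target host. Using the hostname as a stand-in for "domain" and the function name `parallel_links` are simplifying assumptions.

```python
from collections import Counter
from urllib.parse import urlsplit


def parallel_links(target_urls: list[str]) -> list[int]:
    """For each outgoing link, count the other links to the same target host."""
    hosts = [urlsplit(url).hostname or "" for url in target_urls]
    per_host = Counter(hosts)
    # Subtract one so a link does not count itself.
    return [per_host[host] - 1 for host in hosts]


outgoing = [
    "https://x.example/a",
    "https://x.example/b",
    "https://y.example/",
]
counts = parallel_links(outgoing)
```

In this example, each of the two links to `x.example` has one parallel link, and the single link to `y.example` has none.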
9. Link Spam and Assessment
Link Spam Detection
The IndexingDocjoinerAnchorSpamInfo system evaluates and demotes spammy anchor texts. It is a model that analyzes anchor texts to identify potential spam links, so they can be filtered out when a page's link value is calculated.
The system keeps track of when links are discovered by recording the time frame from the first to the last link in the document. It compares how many of these links appear spammy with the total number of links and calculates a ratio that determines whether links should be demoted.
Classification and Link Effect Adjustment
When IndexingDocjoinerAnchorSpamInfo detects that a high proportion of anchor texts contain spam phrases, it can result in a demotion of these links. The demotion means that links from pages deemed spammy lose their significance in Google's search results.
This can happen for either all links in a certain period or only for links directly identified as spam. The model also uses a calculated value that indicates how likely it is that a link is spam.
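The ratio-based demotion described above can be sketched like this. The spam phrase list, the 0.5 threshold, and all function names are hypothetical; the leak describes the idea of a spam ratio but not its phrase lists or cut-off values.

```python
# Hypothetical spam phrases; the real phrase lists are not public.
SPAM_PHRASES = ("cheap pills", "casino bonus")


def is_spammy(anchor_text: str) -> bool:
    """True when the anchor text contains one of the assumed spam phrases."""
    text = anchor_text.lower()
    return any(phrase in text for phrase in SPAM_PHRASES)


def spam_ratio(anchor_texts: list[str]) -> float:
    """Share of anchors that look spammy among all observed anchors."""
    if not anchor_texts:
        return 0.0
    spammy = sum(1 for text in anchor_texts if is_spammy(text))
    return spammy / len(anchor_texts)


def should_demote(anchor_texts: list[str], threshold: float = 0.5) -> bool:
    """Demote when the spam ratio reaches the (assumed) threshold."""
    return spam_ratio(anchor_texts) >= threshold


anchors = ["casino bonus here", "our pricing page", "cheap pills online", "about us"]
ratio = spam_ratio(anchors)
```

With two of the four anchors matching a spam phrase, the ratio is 0.5, which reaches the assumed threshold and would trigger a demotion in this sketch.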
Trustworthy Sources Can Still Be Demoted
Furthermore, the system accounts for trustworthy sources. Even if the link comes from a trustworthy source, it is still evaluated on its own text. The system also analyzes examples of links from trustworthy sources and uses this information to improve accuracy in identifying spammy links.
Example Scenario
Let's imagine a situation where an otherwise trustworthy page contains some links to low-quality or spammy websites. IndexingDocjoinerAnchorSpamInfo will evaluate the link texts to determine if they contain spammy phrases – that is, words or expressions often associated with spam.
Even though the page normally doesn't link to spam and generally has a good reputation, the system can still demote the outgoing links if they're deemed spammy, especially if they're also assessed to point to bad or suspicious websites.
Critical Point: You could buy links from a very trustworthy website, but if your own website isn't trustworthy, and if the anchor text seems spammy, then the link has zero value for your website. Conversely, on the same page there could be a link to another page that does provide value – as long as the recipient website and anchor text aren't spammy, value will be transferred, otherwise not.
Summary
The Google leak confirms that PageRank is still a completely central part of Google's algorithm, where links with high PageRank have great influence on a page's ranking. At the same time, it documents that links from traffic-rich pages that receive many clicks from organic search results have greater value. Source quality is crucial: links from high-quality pages weigh heavier than links from low-quality pages. Furthermore, fresh links from newly published articles have more weight than links from older content.
Google uses linkmaps to assess PageRank, where details such as link weight, link attributes, and anchor texts are included. Internal links support navigation and PageRank distribution internally, while external links contribute to a website's authority and credibility. Links are evaluated based on an aggregate score that includes PageRank, relevance, and topicality, where the topical weighting of links depends on both PageRank and the relevance of the anchor text.
Google also analyzes anchor texts and hyperlink data to understand their context. Link spam is identified and demoted, which can reduce the value of links from trustworthy sources if they're deemed "spammy".
Key Takeaway
Overall, the leak emphasizes the importance of getting links from relevant, traffic-rich, and trustworthy sources, as well as ensuring that anchor texts and link context are of high quality and not manipulative in any way.