The Mysterious Google PageRank Algorithm

Even though there are many clues about the factors used by the Google PageRank algorithm, it remains mysterious. Google will never tell us exactly how it works, in order to prevent manipulation. Besides, they continuously recalibrate the algorithm based on experiments.

Citations (backlinks)

The number of citations in the form of backlinks is the most important factor. There is a subtlety here. Your webpage has to be cited by (linked from) highly ranked web pages to gain high visibility. The emphasis is on the word “page.” The PageRank algorithm is supposed to be page based, not domain based. The implication is that a link from a high-rank page on a low-rank website may be worth more than a link from a low-rank page on a high-rank website. Related to the citation factor, Google could also be using various indexes similar to the H-index used in academia.
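To make the "who links to you matters" idea concrete, here is a minimal sketch of the textbook PageRank iteration. It is not Google's production code. The function name pagerank, the tiny link graph, and the page names (hub, lonely, a, b, x1, x2, x3) are all made up for illustration. The damping value 0.85 is the one suggested in the original Brin and Page paper, and I am using the common normalized variant in which the ranks sum to 1.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    This is an illustrative sketch, not Google's implementation.
    """
    # Collect every page that appears as a source or as a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small base amount.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical graph: "hub" is cited by three pages, "lonely" by none.
# Page "a" has one backlink from "hub"; page "b" has one from "lonely".
links = {
    "x1": ["hub"],
    "x2": ["hub"],
    "x3": ["hub"],
    "hub": ["a"],
    "lonely": ["b"],
    "a": [],
    "b": [],
}

ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page:8s} {ranks[page]:.4f}")
```

Pages a and b each have exactly one backlink, yet a ends up ranked above b because its single citation comes from a highly ranked page. That is the subtlety described above: the rank of the citing page counts, not just the number of citations.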

Duplication is punished

If you place your article in multiple web locations, thinking that this increases its chances of surviving for posterity, you will pay a price in discoverability through Google search. I have examples of this. I placed copies of some of my WP articles on the Medium platform. The pageviews of these articles on the WP platform decreased dramatically, and the pageviews on the Medium platform stayed at a minimal level. Since my articles get traffic mostly from search engines, I concluded that the search engines must have reduced the visibility of my duplicated articles.
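I don't know how Google actually detects duplicates. As an illustration only, here is a well-known textbook technique for spotting copied text: break each document into overlapping word sequences ("shingles") and compare the two sets with Jaccard similarity. The function names shingles and jaccard, the shingle size of 3 words, and the sample sentences are my own assumptions for this sketch.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(text_a, text_b, k=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        # Both texts are shorter than k words; treat them as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical sample texts.
original = "the pagerank algorithm ranks pages by the links that point to them"
copy_on_other_site = "the pagerank algorithm ranks pages by the links that point to them"
unrelated = "philosophy does not sell well on the web today"

print(jaccard(original, copy_on_other_site))  # identical text
print(jaccard(original, unrelated))           # no shared word sequences
```

A score near 1.0 means the two pages carry essentially the same text. A search engine that sees such a pair has to decide which copy to show, and that would be consistent with the drop in pageviews I observed.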

Age of the webpage

The visibility of older webpages decreases. Newer webpages are given higher rank. It may be a good idea to make minor edits to your older webpages.

Content

Ah…the magical word “content!” What can I say? Philosophy doesn’t sell. I know that much.

Make it easy for search engines to crawl your website

I have an index page that contains links to my articles. This makes it easy for the search engines to discover the content on my website. Internal links may increase the value of your website.
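An index page like this can be generated rather than maintained by hand. Below is a minimal sketch that builds a plain HTML page of links from a list of articles. The function name build_index, the page title, and the article titles and URLs are hypothetical. A real site would pull the list from its own database or file listing.

```python
from html import escape


def build_index(articles, title="Index of Articles"):
    """Build a simple HTML index page from (title, url) pairs."""
    lines = [
        "<!DOCTYPE html>",
        "<html>",
        f"<head><title>{escape(title)}</title></head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
        "<ul>",
    ]
    for name, url in articles:
        # Escape both the link text and the URL so odd characters
        # cannot break the markup.
        lines.append(f'  <li><a href="{escape(url, quote=True)}">{escape(name)}</a></li>')
    lines += ["</ul>", "</body>", "</html>"]
    return "\n".join(lines)


# Hypothetical article list.
articles = [
    ("The Mysterious Google PageRank Algorithm", "/pagerank/"),
    ("Notes on Duplicate Content", "/duplicates/"),
]

page = build_index(articles)
print(page)
```

Every article is then one plain link away from a single page, so a crawler can reach all of them without depending on menus, search boxes, or scripts.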

PageRank is not a recommendation algorithm

The YouTube recommendation algorithm gives a chance to newcomers by promoting them. The YouTube algorithm seems to have a random component in its logic by design. The Google PageRank algorithm, on the other hand, is not a recommendation engine. I don’t think there is any randomness in the Google PageRank algorithm.

Historical document

There is a document kept at Stanford University, written by the founders of Google, Sergey Brin and Lawrence Page:

The Anatomy of a Large-Scale Hypertextual Web Search Engine

In the conclusion of the original paper, Brin and Page state:

“Google is designed to be a scalable search engine. The primary goal is to provide high quality search results over a rapidly growing World Wide Web. Google employs a number of techniques to improve search quality including page rank, anchor text, and proximity information. Furthermore, Google is a complete architecture for gathering web pages, indexing them, and performing search queries over them.”

I am sure there have been many changes and improvements since then.
