How to Study the Ranking Factors of a Search Engine

Why You Can't Trust Correlation Studies of Ranking Factors by SEO Experts

Studies performed by SEO experts to determine the validity of search engine ranking factors are mainly based on correlation.

Correlation studies measure the relation between two variables. In our case, one variable is the "ranking factor" and the other is the "position" of web pages in search results.

A positive correlation indicates that both variables increase and decrease together. SEO experts who value correlation studies assume that the PRESENCE of a "ranking factor" in multiple pages is responsible for the HIGH "position" of those pages in search results. Likewise, they assume that its ABSENCE is responsible for a LOW "position".

However, even a positive correlation is NOT SUFFICIENT PROOF because "correlation does not imply causation". A ranking factor cannot be validated in this manner because of the large number of factors involved and the difficulty of isolating individual factors without conducting a proper SEO experiment.
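
To make the idea concrete, here is a minimal sketch of how such a correlation could be computed. The positions and the "has_factor" flags are made-up illustration data, not results from any real study, and the pearson() helper is just one common way to measure correlation. A high value only shows that the factor and good positions occur together; it says nothing about which one causes the other.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: positions 1..10 in one results page, and whether
# each page shows the suspected ranking factor (1 = present, 0 = absent).
positions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
has_factor = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

# Position 1 is the best result, so negate positions to get a
# "higher is better" score before correlating.
rank_score = [-p for p in positions]

r = pearson(has_factor, rank_score)
print(f"correlation between factor presence and rank: {r:.2f}")
```

A value near +1 would mean the factor appears mostly on the best-placed pages, a value near 0 would mean no visible relation, and a negative value would mean it appears mostly on poorly placed pages.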

Correlation vs Causation

A causation study, or causality analysis, examines the relation between cause and effect. It is a better method for determining how search engines rank results because it identifies the real reason behind an effect. A positive correlation is best used as the starting point of a causation study, but only a "cause and effect" analysis shows why a search engine alters the position of a web page in its results.

So, how do you test a positive correlation using a "cause and effect" method? Follow these steps. Update a web page with the factor that the positive correlation suggests will produce a higher position in the search results. If the position improves, a genuine cause-and-effect relationship also requires that removing the factor results in a decline in the page's position. Test this repeatedly for one page and then for multiple pages.
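
The add-and-remove test described above can be sketched as a simple check over a log of observations. The function name consistent_with_causation() and the sample logs are hypothetical, and the sketch assumes you record the page's position after each change has had time to take effect.

```python
def consistent_with_causation(trials):
    """Check an add/remove test log.

    trials: list of (factor_present, position) tuples in chronological
    order, where a smaller position number means a better placement.
    Returns True only if every time the factor was added the position
    improved, and every time it was removed the position declined.
    """
    for (prev_on, prev_pos), (cur_on, cur_pos) in zip(trials, trials[1:]):
        added = cur_on and not prev_on
        removed = prev_on and not cur_on
        if added and cur_pos >= prev_pos:
            return False  # factor added, but position did not improve
        if removed and cur_pos <= prev_pos:
            return False  # factor removed, but position did not decline
    return True

# Hypothetical log for one page: the factor is toggled four times.
page_a = [(False, 18), (True, 11), (False, 17), (True, 12), (False, 19)]

# Hypothetical log where removing the factor did not hurt the position.
page_b = [(False, 18), (True, 11), (False, 10)]

print(consistent_with_causation(page_a))
print(consistent_with_causation(page_b))
```

A single consistent log is still weak evidence, which is why the test should be repeated on several pages before the factor is treated as a cause.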

Two popular correlation studies are the SEOmoz list and the Searchmetrics list.

SEOmoz's ranking signals are based on the opinions of several SEO industry experts after analyzing large-scale data using correlation. The Searchmetrics list is the result of another correlation study based on Google's search results. But, as explained above, you need to apply the principle of causation to these lists before accepting their validity.

Why is it Difficult to Determine Search Engine Ranking Factors?

Several genuine obstacles hamper the detection of the exact set of search engine ranking factors.

First, search engines frequently update their ranking algorithms, so the combination of ranking factors and their corresponding weights (ranking points) is constantly changing. Google, for example, uses over 200 different ranking signals. What works today for acquiring a top position in Google is a lot different from what worked the previous year.
Google algorithm updates like Panda and Penguin are responsible for frequent changes that many SEO experts still find hard to grasp. Panda is a "document classification" algorithm. It classifies web documents as either "high quality" or "low quality", and it uses web documents as input for updating its own ranking criteria via "machine learning". Penguin is a spam detection algorithm that detects signals for keyword spam and link spam. Both Panda and Penguin constantly add or modify ranking signals, thereby making it difficult to pinpoint ranking factors.

Second, whether you use a correlation study or a causation study, there is a high probability that an unknown variable, rather than the assumed variable, is responsible for altering the position in search results. Three things can cause this error in judgement. One is the sheer number of signals (200+ in Google's case), which creates uncertainty. The other two relate to the dynamic search environment. Search algorithms organize results using the entire web document corpus, which changes every second. In addition, the search queries made by Google users also influence search results, and the search query corpus changes every second too. So, the position of a web page in search results can vary purely because of frequent updates to the web document corpus and the search query corpus.
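
A small simulation can illustrate how an unknown variable misleads a study. In the sketch below, everything is invented for illustration: a hidden variable drives the ranking score and also makes an observed "factor" more likely to appear, while the observed factor itself has no effect on the score at all. This is not a model of any real search engine.

```python
import random

random.seed(42)  # fixed seed so the simulation is repeatable

pages = []
for _ in range(2000):
    hidden = random.random()                # unobserved variable, between 0 and 1
    has_factor = random.random() < hidden   # hidden variable makes the factor more likely
    score = hidden + random.gauss(0, 0.1)   # score depends ONLY on the hidden variable
    pages.append((has_factor, score))

# Rank the pages: the highest score gets position 1.
ranked = sorted(pages, key=lambda page: page[1], reverse=True)
with_factor = [pos for pos, (f, _) in enumerate(ranked, start=1) if f]
without_factor = [pos for pos, (f, _) in enumerate(ranked, start=1) if not f]

mean_with = sum(with_factor) / len(with_factor)
mean_without = sum(without_factor) / len(without_factor)

print(f"average position with factor:    {mean_with:.0f}")
print(f"average position without factor: {mean_without:.0f}")
```

Pages showing the factor land in clearly better positions on average, even though the factor never entered the score. An observer who cannot see the hidden variable would wrongly credit the factor.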

Finally, the signals that Google uses are NOT revealed to anybody. Only a few Google employees responsible for maintaining the quality of search results are aware of these signals. Every SEO expert first detects these signals by making a guess. So, verify carefully when an SEO expert claims to know any or all of these signals and promises to deliver high ranks for targeted keywords.

Does this mean you can't determine even a subset of the ranking factors correctly? No. You can, by analyzing search results and experimenting with isolated web pages. But it's important to remember that search engines are constantly introducing new factors into the mix and updating their weights.

Organize by Keyword Relevance, Web Page Quality, and Popularity

To improve search engine rankings, it's a good idea to organize the ranking factors of a web page under the following three categories.

Keyword Relevance: Search queries, called keywords, determine relevancy. Keywords used in the published content determine the topic of a web page. The entire content on a web page should be relevant to one main topic.
A web page is usually relevant to multiple topics because of all the words used in it. However, the degree of relevance varies for each topic. The degree of relevance to a particular topic can be improved by emphasizing and repeating keywords related to the topic.
SEO experts generally try to restrict the relevancy of a web page to a single topic. Some do it wrongly by stuffing the page content with one or two keywords. This is one type of keyword stuffing. Writing only about a single topic is not mandatory for SEO. The only reason to avoid including multiple topics on a web page is to provide a good user experience.
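
As a rough illustration of "degree of relevance", the sketch below measures how large a share of a page's words each keyword takes up. This keyword density is only a crude proxy chosen for the example. It is not how any search engine is known to score relevance, it handles single-word keywords only, and the sample text is invented.

```python
import re
from collections import Counter

def keyword_density(text, keywords):
    """Return the share of all words on the page taken up by each keyword.

    Only single-word keywords are handled in this sketch.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(words)
    total = len(words)
    if total == 0:
        return {kw: 0.0 for kw in keywords}
    return {kw: counts[kw] / total for kw in keywords}

# Hypothetical page text.
page_text = (
    "Cheap flights to Rome. Compare cheap flights and hotels in Rome "
    "before you book flights."
)

density = keyword_density(page_text, ["flights", "rome", "hotels"])
for keyword, share in density.items():
    print(f"{keyword}: {share:.1%}")
```

A page where one or two keywords take up an unusually large share of the text is a candidate for the keyword stuffing described above.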

Web Page Quality: How you organize and present content on a web page primarily determines its quality. Individual page layout and website navigation structure are very important in calculating the SEO Quality Score of a web page.
Other quality checks cover site uptime and performance, including the download time of individual web pages and overall site speed. But unless your entire website is very slow, site speed can't be considered an important ranking factor.
Preparing a checklist of good web design principles to match with search quality rating guidelines is the best way to improve web page quality.

Popularity: The external links pointing to a website determine its popularity. This is termed link popularity.
Brand popularity also affects ranking. You can enhance brand popularity by using a website's brand name widely on the web.
Enhancing brand popularity also increases brand name searches, which improves search engine rankings to a great extent.

The Correct Way to Study Search Engine Ranking Factors

So, what are the methods to determine the relevance, quality and popularity aspects of a web page?

The best method is to analyze the top 10 results in the search engine results pages (SERPs) of thousands of search queries.

First, find matching patterns in the high-ranking pages for keyword relevancy and high web page quality. Then, conduct separate SEO experiments in a controlled environment, isolating each of the matching patterns and testing them on separate web pages.

Finally, treat a matching pattern as a confirmed ranking factor (cause) in the search algorithm only if the rank improves (effect) each time the SEO experiment is repeated. This is the only way you can get close to determining a factor which affects search engine rankings.
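
The two stages above, finding matching patterns and then confirming them by repeated experiments, can be sketched as follows. The feature names, the 80% threshold, and all of the data are hypothetical and far smaller than the thousands of queries a real analysis would need.

```python
from collections import Counter

def pattern_frequency(serps):
    """Share of top results that show each feature.

    serps: dict mapping a query to a list of feature sets,
    one set per top-ranked page for that query.
    """
    counts = Counter()
    total = 0
    for results in serps.values():
        for features in results:
            counts.update(features)
            total += 1
    return {feature: n / total for feature, n in counts.items()}

# Hypothetical observations: features seen on top results for two queries.
serps = {
    "query a": [{"keyword_in_title", "fast_page"}, {"keyword_in_title"},
                {"keyword_in_title", "many_links"}],
    "query b": [{"keyword_in_title", "many_links"}, {"fast_page"},
                {"keyword_in_title"}],
}

# Stage 1: keep patterns shared by most top results (assumed 80% threshold).
shares = pattern_frequency(serps)
candidates = sorted(f for f, share in shares.items() if share >= 0.8)

# Stage 2: hypothetical experiment log, (position before, position after)
# for each isolated test page that received the pattern.
experiments = {"keyword_in_title": [(14, 9), (22, 15), (31, 30)]}

# A candidate counts as confirmed only if the position improved every time.
confirmed = {
    feature: all(after < before for before, after in experiments[feature])
    for feature in candidates
}
print(candidates)
print(confirmed)
```

Requiring an improvement in every repetition is deliberately strict. Given how often the document and query corpora change, a pattern that improves the rank only some of the time needs more repetitions before any conclusion is drawn.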