There is a lot of confusion out there surrounding how search engines like Google actually work. Some people think that the search engine looks at the individual words contained in a web page and ranks pages according to that. The truth is there is a lot more that goes into it.
I will try to explain how they do it, though a truly precise answer would take volumes of text and require very elaborate mathematical formulas and algorithms. Still, let’s give it a try. To start…
Search engines are basically just super-complicated databases that have been programmed with the ability to learn from their mistakes and improve their results as they go along. We use search engines by typing in keywords to describe what we are looking for and our engine of choice will search its database to find pages that contain those keywords.
Google makes small changes to the algorithm as often as 3 times every day, fixing little things here and there. Additionally, they roll out major core algorithm updates several times a year.
A simple way to think about it is to compare search engines with librarians – they both sort information and provide you with a list of resources according to what best suits your needs.
How do librarians work? They ask you which topics interest you, then try to find books on those topics and put them in order, starting with the most relevant material. The task of a librarian is relatively simple – it can be boiled down to:
1) Listen to the query
2) Retrieve a list of books on the topic
3) Sort the list and present it to the user
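The three steps above can be sketched in a few lines of Python. This is only a toy illustration: the catalog, the book titles, and the `answer_query` function are made up for the example, and relevance here is simply the number of query words a book's topics share with the query.

```python
# Hypothetical toy catalog: book titles mapped to the topics they cover.
CATALOG = {
    "Dog Breeds of the World": ["dog", "breeds"],
    "Training Your Dog": ["dog", "training"],
    "Cat Care Basics": ["cat", "care"],
}

def answer_query(query, catalog=CATALOG):
    # 1) Listen to the query: normalise it into lowercase terms.
    terms = set(query.lower().split())
    # 2) Retrieve every book that shares at least one topic with the query.
    matches = {}
    for title, topics in catalog.items():
        overlap = len(terms & set(topics))
        if overlap:
            matches[title] = overlap
    # 3) Sort by overlap (most relevant first), breaking ties alphabetically.
    return sorted(matches, key=lambda title: (-matches[title], title))

print(answer_query("dog breeds"))
```

A real search engine follows the same listen, retrieve, sort outline, but each step is far more involved than this sketch suggests.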
How do search engines work? They ask you what topics interest you and try to give you a list of pages that they know contain information on those topics. Sounds simple, but the devil is in the details. The key difference between search engines and librarians (besides sheer speed and scale) is that while librarians sort through the books in one library, search engines sort through billions of indexed web pages and present them in the way that best matches your interests, using information accumulated over time.
In order to serve up the most relevant information, search engines need to be able to scan through potentially billions of pages and sort them according to what they think is the most relevant *for you*. You and I may enter the same search term and get different results depending on a multitude of factors. What factors affect search results? Well, let’s take a look.
First, the algorithm tries to understand the meaning behind your query. That is why it’s better to search with a descriptive phrase or sentence rather than a single word. For example, if you search for “dog,” the results will likely just serve up general information on what a dog is, which is not necessarily what you were looking for. However, the algorithm also tries to guess what you’re looking for, and instead of just matching a keyword, it will look for other relevant media such as images and videos. It will also provide suggestions to help you narrow your search further.
On the other hand, if your search is “dog breeds,” you will see a list of articles on different dog breeds. The difference between the two searches is the meaning behind them, which poses a problem to search engines – they need to figure out what you mean by “dog” or “dog breeds.”
In order to determine that, search engines look at web pages and try to figure out which words on those pages are most relevant. Let’s take a step back for a second – how do search engines know which words are more important than others? What you need to keep in mind is that search engines are indexing billions of web pages. In order to create an effective index, they have developed a set of algorithms and techniques that reduce the amount of data they need to store and let them sort through it quickly using minimal computing power.
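One common way to build such an index is an inverted index, which maps each word to the pages that contain it, so a query never has to scan every page. The sketch below is a minimal illustration, not how any particular search engine implements it; the page names and the `build_index` and `lookup` functions are made up for the example.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def lookup(index, query):
    """Return the ids of pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical pages, used only for illustration.
PAGES = {
    "page_a": "dog breeds and their temperaments",
    "page_b": "how to train a dog",
    "page_c": "cat breeds explained",
}

index = build_index(PAGES)
print(lookup(index, "dog"))
print(lookup(index, "dog breeds"))
```

Searching for “dog” matches two pages here, while “dog breeds” narrows the result to the single page that contains both words.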
Indexes don’t contain a lot of information – just a list of words, some identifying information on web pages, and a key metric called PageRank. PageRank measures how important a page is based on the number and quality of links that point to it. You can think of it as a popularity contest, but instead of people voting for the contestant they think is the most popular, each link to a page counts as a vote, and votes from important pages count for more. Other signals, such as how often people visit or share a page, may feed into ranking too, but they are not part of PageRank itself.
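To make the “popularity contest” idea concrete, here is a minimal sketch of the PageRank calculation as it is commonly described: every page repeatedly shares its score among the pages it links to. The link graph, the damping value of 0.85, and the iteration count are assumptions chosen for the example, and this simplified version is not what Google runs in production.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    `links` maps each page to the pages it links to; every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link graph: "c" is linked to by three other pages.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(page, round(score, 3))
```

In this toy graph, page “c” ends up with the highest score because the most pages link to it, and page “d” ends up with the lowest because nothing links to it.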
The next thing an algorithm tries to determine is whether a page is relevant to your search. If the page doesn’t seem to contain helpful information related to what you’re looking for, it probably belongs at the bottom of your results. The algorithm has a number of ways to determine how relevant something is including:
It’s not uncommon to see some sites at the top of your results that aren’t necessarily relevant to your search. They’re often there because something on their site is related to a word in your query. It may not be a direct match, but search engines use a number of factors to determine how relevant something is.
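One classic relevance measure is TF-IDF, which scores a page higher when a query word appears often on that page but rarely across the whole collection. The sketch below is an illustration only: the sample pages and the `tf_idf_scores` function are made up, and modern search engines combine many more signals than this.

```python
import math
from collections import Counter

def tf_idf_scores(query, pages):
    """Rank pages for a query with a basic TF-IDF score, highest first."""
    tokenized = {pid: text.lower().split() for pid, text in pages.items()}
    n_pages = len(tokenized)
    terms = query.lower().split()
    scores = {}
    for pid, words in tokenized.items():
        counts = Counter(words)
        score = 0.0
        for term in terms:
            # Document frequency: how many pages contain the term at all.
            df = sum(1 for other in tokenized.values() if term in other)
            if df == 0 or not words:
                continue
            tf = counts[term] / len(words)        # how prominent the term is on this page
            idf = math.log(n_pages / df) + 1.0    # rarer terms carry more weight
            score += tf * idf
        scores[pid] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical pages, used only for illustration.
SAMPLE_PAGES = {
    "page_a": "dog breeds list of popular dog breeds",
    "page_b": "dog training tips for a new puppy",
    "page_c": "cat breeds and cat care",
}

ranked = tf_idf_scores("dog breeds", SAMPLE_PAGES)
for page_id, score in ranked:
    print(page_id, round(score, 3))
```

Notice that the page about cat breeds still receives a non-zero score for “dog breeds” because it shares one word with the query, which is one way a loosely related page can show up in results.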
However, algorithms try to maintain an objective view of subject matter, and as such, avoid the influence of subjective opinions. Algorithms are based almost entirely on math, which makes them consistent and, in principle, less prone to subjective bias.
Quality matters not only for search engines but also for users. Sites that offer useful and entertaining information will get a higher ranking than those that don’t meet this threshold. The search engine algorithm will try to understand what makes a site valuable and informative by looking at:
These are just a few questions an algorithm might ask when assessing site quality, but you can imagine how complex this may become. Because algorithms are limited in their ability to understand information the way a human does, Google and other search engines also have humans who manually review search results. These evaluators are highly trained and work from a nearly 200-page manual of search quality evaluator guidelines. Their ratings are used to judge how well the algorithms are working rather than to rank individual sites directly.
User experience (UX) refers to how easily a user can navigate your site, the aesthetics of the site, and how quickly they can find what they’re looking for. Of course, search engines have built their own algorithms to assess a site’s UX.
Google places a heavy emphasis on UX both in design and content. Some of the questions the algorithm seeks to answer are:
Finally, your browsing history plays a pivotal role in how the algorithm tries to predict the best results for you. Google is constantly trying to improve its ability to guess what you’re looking for.
If you’re logged into Google, they are able to tailor your search experience based on data they have about you including:
While this is helpful, it can also be cause for concern. Since search engines use so much information about users to tailor results, it’s important for them to anonymize the data they collect. This way, results can be tailored without tying an identifiable person’s browsing history to what they are shown. It also prevents someone you know (who may appear in your browsing history) from affecting the way you’re served results.
While search engine algorithms are constantly changing and becoming more sophisticated, there is still a lot of room for improvement. In fact, researchers believe that in the future we will be able to use AI technology to ask questions directly to our computers and get answers.
We already use natural language processing for queries when searching. BERT (Bidirectional Encoder Representations from Transformers) was developed by Google, which has since introduced MUM (Multitask Unified Model), a newer model that Google describes as 1,000 times more powerful than BERT. MUM is also meant to help Google branch out from understanding only the text on a page to analyzing images as well.
As technologies such as AI and quantum computing continue to evolve, search algorithms as we know them are going to evolve alongside them. In fact, Google has already started to integrate AI into its search algorithms by using it to predict what users are searching for as they type queries in the search bar. The future certainly looks bright for search algorithms.
If you’re unsure of how well your site is performing in search results, it’s a good idea to connect with a Search Engine Optimization specialist who understands the ins and outs of search engines. I can help you understand which aspects of your site may not be working optimally when it comes to search, and how you can make changes to increase your rankings. To get started, contact me today!