Google Shows Us Context is King When Indexing People

Posted @ Dec 11 2017 by

Context is King

Sometimes, you’ll see someone writing about SEO say that “Content is King”. But when I see that, I’m often tempted to respond with, “Context is king.”

The right words at the right time, meeting the intent of someone searching, can fulfill their informational or even situational needs. When you write a page, you should include words that help show the context of the keywords you are focusing on during its creation.

More Patents about Context

Recently, I’ve been noticing patents coming from Google focusing on context. One of the first was a patent about context vectors. I wrote about these context vectors in the post: Google Patents Context Vectors to Improve Search.

In the post, Does Tomorrow Deliver Topical Search Results at Google?, I wrote about a Google patent that told us search engines may start showing breadcrumbs before each search result, which could indicate the intended meaning of a query term that has more than one meaning.

Because I have noticed how Google is paying more attention to context in indexing pages and in displaying pages in search results, the word “context” in a very recent patent from Google caught my attention, especially when I saw its title, “Name disambiguation using context terms,” which I am writing about today.

Consider a common name such as John Smith. There is a well-known John Smith from England who helped establish the colony at Jamestown, the first permanent English settlement in North America. There is another John Smith who was well known as a botanist. This patent tells us that it might identify some context vocabulary terms that could be associated with each John Smith. So, if someone searches for John Smith and Kew Gardens, they are likely looking for the botanist. Another person searching for John Smith and Virginia is likely looking for the explorer.
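To make that idea concrete, here is a minimal sketch of matching a query against context vocabularies. The vocabularies, the function name `likely_person`, and the overlap-counting rule are all assumptions made for illustration; the patent does not spell out a scoring method like this one.

```python
# Hypothetical context vocabularies (illustrative terms, not taken from the patent).
CONTEXT_TERMS = {
    "John Smith (explorer)": {"virginia", "jamestown", "powhatan", "colony"},
    "John Smith (botanist)": {"kew", "gardens", "botany", "curator"},
}


def likely_person(query, name="john smith"):
    """Return the person whose context terms overlap most with the query.

    Returns None when the query carries no known context term.
    """
    # Strip the name itself so only the extra context words remain.
    remaining = query.lower().replace(name, " ")
    words = set(remaining.split())

    best, best_overlap = None, 0
    for person, terms in CONTEXT_TERMS.items():
        overlap = len(words & terms)
        if overlap > best_overlap:
            best, best_overlap = person, overlap
    return best


print(likely_person("John Smith Kew Gardens"))  # John Smith (botanist)
print(likely_person("John Smith Virginia"))     # John Smith (explorer)
print(likely_person("John Smith"))              # None
```

A query with only the name returns nothing here, which is the ambiguous case that the query suggestions described in the patent are meant to address.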

This patent identifies the problem that it aims to solve with similar examples:

A very popular search scenario is searching on person names. As most person names are not unique, an initial search on a person name can yield multiple search results that each reference resources describing different persons. For example, a search on the name of “John Smith” may yield search results that reference resources with information about an explorer, resources about a botanist and curator of Kew Gardens, resources about a professional wrestler, and still other resources about other people that are named “John Smith.” As search queries are often an incomplete expression of the information needed, the user will often revise the search query to focus in on search results. Such revisions including adding addition search terms to the name. For example, suppose the user is searching for information relating to the explorer John Smith’s interactions with Chief Powhatan. The user may revise the query to read “John Smith Chief Powhatan.” The search query will cause a search engine to provide search results that reference documents that are more likely to satisfy the user’s informational needs.

When Wikipedia has an entry about more than one person with the same name, it contains “disambiguation” information. In addition to the person’s name, it may contain other terms to identify which person is being referred to. In the case of the John Smiths I am writing about, that may be:

John Smith (botanist)
John Smith (explorer)

Those representative terms may be taken from search suggestions:

The query terms and representative terms may look like disambiguation terms found in Wikipedia:

Wikipedia Disambiguation suggestions

Not mentioned in the patent is that we may see knowledge panels that identify each of these different John Smiths:

The John Smith (explorer):

The John Smith (botanist):

How does Google keep track of the different John Smiths?

1. Context term lists are created for people’s names:

…each context term list being a list of context terms from a resource for the person name, and each of the resources to which the context term lists for the person name correspond being different resources; clustering the context term lists into a plurality of clusters, each of the clusters of context term lists including context term lists that are most similar to the cluster relative to other clusters; for each of the clusters, selecting a representative term for the cluster; receiving the person name as a search query; and generating a plurality of query suggestions from the search query and the representative terms for the clusters, each query suggesting being a combination of the person name and one representative term.

2. Advantages of this context approach, according to the patent:

Users are provided query suggestions for person names, and each suggestion is representative of a context associated with the name. Each context is used to disambiguate the name, and thus the user can quickly focus a search on an appropriate context without having to manually determine the various contexts. Person names that would otherwise have a dominant interpretation (e.g., names of famous people or historical figures) are disambiguated among contexts, and the dominant interpretation is limited to a proper subset of the contexts. Accordingly, the system can provide query suggestions for the dominant interpretation and multiple other contexts that are not associated with the dominant interpretation.

3. Resources that refer to the same person (i.e., a name disambiguated among contexts) may be separately clustered.

Other terms that may have more than one meaning, like Jaguar, may have results separately clustered: for instance, the Jaguar car, the jaguar cat, and the Jacksonville Jaguars NFL football team.
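The claim language quoted in step 1 (build a context term list per resource, cluster the lists, pick a representative term per cluster, and combine it with the name) can be sketched roughly as follows. This is only an illustration under stated assumptions: the greedy Jaccard-similarity clustering, the 0.2 threshold, the "most frequent term" rule for the representative term, and all function names and sample data are my own stand-ins, not the method the patent specifies.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two collections of terms."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_term_lists(term_lists, threshold=0.2):
    """Greedy one-pass clustering (an assumed stand-in for the patent's clustering).

    Each list joins the existing cluster it is most similar to, or starts
    a new cluster when no similarity reaches the threshold.
    """
    clusters = []  # each cluster is a list of context term lists
    for terms in term_lists:
        best, best_sim = None, threshold
        for cluster in clusters:
            pooled = set().union(*cluster)  # all terms seen in this cluster
            sim = jaccard(terms, pooled)
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([terms])
        else:
            best.append(terms)
    return clusters


def representative_term(cluster):
    """Pick the term found in the most lists (ties broken alphabetically)."""
    counts = Counter(t for terms in cluster for t in set(terms))
    return min(counts, key=lambda t: (-counts[t], t))


def query_suggestions(name, term_lists, threshold=0.2):
    """Combine the person name with one representative term per cluster."""
    clusters = cluster_term_lists(term_lists, threshold)
    return [f"{name} {representative_term(c)}" for c in clusters]


# Hypothetical context term lists, one per resource mentioning "John Smith".
RESOURCE_TERM_LISTS = [
    ["explorer", "jamestown", "virginia", "powhatan"],
    ["explorer", "virginia", "colony"],
    ["botanist", "kew", "gardens", "curator"],
    ["botanist", "kew", "ferns"],
]

print(query_suggestions("John Smith", RESOURCE_TERM_LISTS))
# ['John Smith explorer', 'John Smith botanist']
```

With this sample data, the two explorer resources and the two botanist resources fall into separate clusters, and each cluster contributes one suggestion that pairs the name with its representative term.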

The Context Patent

The patent about context and names can be found at:

Name disambiguation using context terms
Inventors: Nitin Gupta and Abhinandan S. Das
Assignee: Google Inc.
US Patent: 9,830,379
Granted: November 28, 2017
Filed: November 29, 2010

Abstract

Methods, systems and apparatus, including computer programs encoded on a computer storage medium, for disambiguating names in a document corpus. In an aspect, a method includes generating context term lists for a person name, each context term list being a list of context terms from a resource for the person name; clustering the context term lists into a plurality of clusters, each of the clusters of context term lists including context term lists that are most similar to the cluster relative to other clusters; for each of the clusters, selecting a representative term for the cluster; receiving the person name as a search query; and generating a plurality of query suggestions from the search query and the representative terms for the clusters, each query suggesting being a combination of the person name and one representative term.

Take-Aways

Much like the patent about context vectors describes, knowledge bases may be helpful in finding information about potentially unique people. For instance, Michael Jackson was a well-known popular singer. There was also a Michael Jackson who was a Deputy Secretary of the Department of Homeland Security. We can learn this from the Michael Jackson (disambiguation) page, just as we can look up the different types of horses for equestrians, for carpenters, and for gymnasts. It’s possible to take terms that help define context for each of those meanings and persons.

This disambiguation patent tells us the queries used to find pages can show us different contexts associated with people’s names. If you are optimizing a page for a particular person, it makes sense to include context terms that help define who you’re writing about.

Where such context terms exist for a person, including them on the pages you create about that person can help a search engine index those pages for the right person.
