Updated: Inferring Geographic Entity Locations in Queries


Where Do Entity Locations Appear in Queries?

Some search queries refer to physical locations. For those queries, the search engine can provide information about the entity's location in addition to the responsive search results. For example, in response to “Eiffel Tower Paris France,” the search engine could provide information about Paris in addition to the search results about the Eiffel Tower. The query “Eiffel Tower” on its own, however, does not specify a physical location.

Inferring Searcher’s Address By IP Address

One way to infer unspecified locations is to identify locations that are geographically near the searcher. In some instances, the geographic location of a searcher can be determined from the searcher’s IP address. A significant weakness in this approach is that it does not provide useful location information when a searcher seeks information about a distant location, for example, a searcher in Los Angeles seeking information about the Eiffel Tower.
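As a rough illustration of this approach and its weakness, the sketch below maps a searcher's IP address to a city using a small, made-up table of network ranges. The table, the `infer_location_from_ip` name, and the addresses (taken from ranges reserved for documentation) are assumptions for illustration; the patent does not describe how the IP lookup is done.

```python
import ipaddress

# Hypothetical mapping of network ranges to cities (documentation-only ranges).
IP_RANGES = {
    ipaddress.ip_network("203.0.113.0/24"): "Los Angeles, CA",
    ipaddress.ip_network("198.51.100.0/24"): "Paris, France",
}


def infer_location_from_ip(ip):
    """Return the city whose range contains the IP, or None if unknown."""
    address = ipaddress.ip_address(ip)
    for network, city in IP_RANGES.items():
        if address in network:
            return city
    return None


# A searcher in Los Angeles types "Eiffel Tower"; the inferred location
# reflects where the searcher is, not where the entity is.
searcher_location = infer_location_from_ip("203.0.113.7")
```

The lookup never sees the query text, which is why it cannot help with a query about a distant entity.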

Inferring Locations by Lookup Table

Another way to infer unspecified or “missing” locations in a search query is to use a lookup table. When a lookup term in the table matches a term in a query, the corresponding location is presumed. A drawback to this method is that it does not provide a notion of the prominence of the search entities when there are multiple entities with the same or similar names. Another drawback is that the locations that can be inferred are limited to those associated with preselected entities in the lookup table.
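A minimal sketch of the lookup-table approach might look like the following. The table entries and the `infer_location_from_table` name are hypothetical, chosen only to make the two drawbacks visible.

```python
# Hypothetical lookup table: query term -> presumed location.
LOOKUP_TABLE = {
    "eiffel tower": "Paris, France",
    "space needle": "Seattle, WA",
}


def infer_location_from_table(query):
    """Return the location of the first table term found in the query, else None."""
    normalized = query.lower()
    for term, location in LOOKUP_TABLE.items():
        if term in normalized:
            return location
    return None
```

Here `infer_location_from_table("Eiffel Tower tickets")` yields the Paris entry, while a query about any entity missing from the table yields nothing. Because each term maps to a single location, the table also has no way to express which of several same-named entities is the more prominent one.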

Deja Vu All Over Again

I was almost finished writing this post when I discovered that I had written about this patent almost a decade ago, in the post How Google Might Use Query Logs to Find Entity Locations. The patent has changed: this one is a continuation patent with new claims, and it is worth comparing the old claims with the new ones to see what has changed. After looking at the patent and a summary of it, I compare the first claims from both versions.

Assigning Entity Locations with Websites

The deficiencies of these approaches to assigning physical entity locations to a query can be overcome by capitalizing on previously submitted queries that refer to physical locations. Information from queries that explicitly or implicitly specify locations can help identify locations for other queries that do not specify locations. The same principle can be applied to websites to infer a physical location for an entity associated with the website.

The method of associating locations with a website consists of:

  • Computing, for each location referenced in queries whose result sets link to documents hosted at the website, a location-specific score that represents the likelihood that the location is associated with the website
  • Computing a site confidence value representing a likelihood that the website is associated with a physical location
  • Determining a location associated with the website using the location-specific scores and the site confidence value
  • Storing information indicating that the determined location is associated with the website for subsequent use when processing respective search queries
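The steps above can be sketched roughly as follows. The patent summary here does not give formulas, so everything specific in this sketch is an assumption: the location-specific score is taken to be a location's share of clicks from location-bearing queries, the site confidence value is taken to be the share of all clicks that came from location-bearing queries, and the two thresholds and the `infer_site_location` name are hypothetical.

```python
def infer_site_location(clicks_by_location, clicks_without_location,
                        min_confidence=0.5, min_score=0.6):
    """Return the inferred location for a website, or None.

    clicks_by_location: clicks on the site's results from queries naming each location.
    clicks_without_location: clicks on the site's results from queries naming no location.
    """
    located_clicks = sum(clicks_by_location.values())
    if located_clicks == 0:
        return None
    total_clicks = located_clicks + clicks_without_location

    # Assumed location-specific score: share of located clicks naming this location.
    scores = {loc: n / located_clicks for loc, n in clicks_by_location.items()}

    # Assumed site confidence: how often the site is reached via location-bearing queries.
    site_confidence = located_clicks / total_clicks

    # Combine both signals: the site must look "local" and one location must dominate.
    best_location = max(scores, key=scores.get)
    if site_confidence >= min_confidence and scores[best_location] >= min_score:
        return best_location
    return None
```

Under these assumptions, a site reached mostly through queries mentioning Paris would be stored with Paris as its location, while a site that is rarely reached through location-bearing queries would get no location at all, regardless of which location scores highest.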

A non-transitory computer storage medium stores programs to be executed by the processors of a computer system. The programs include instructions that, when executed by the processors of the computer system, perform the method of associating locations with a website.

A method of associating locations with a query from a client device is performed by a server system, which includes the processors and memory storing programs for execution by the processors. The method comprises identifying a query, selecting a set of documents responsive to the identified query, and assigning weights to individual documents in the set of documents based, at least in part, on historical data of searcher clicks selecting search result links in search results produced for historical queries the same as the identified query.

The method further includes identifying the respective websites hosting the documents in the set of documents. For each website, location-specific information is retrieved for each associated location, including a location-specific score. That score corresponds to the likelihood that the respective location is associated with that individual website.

The method also includes, for each location for which location-specific information is retrieved, aggregating the location-specific scores, as weighted by the document weights, to compute an aggregated likelihood that the respective location is associated with the query. A specific location is assigned to the query when predefined criteria are satisfied, and those criteria include a requirement that the aggregated likelihood for the particular location exceeds a first predefined value.
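The query-side method described above can be sketched as a weighted sum. This is only an illustration under stated assumptions: document weights are taken to be each document's share of historical clicks for the query, the threshold value stands in for the "first predefined value," and the function name, parameter names, and example domains are hypothetical.

```python
from collections import defaultdict


def infer_query_location(doc_clicks, doc_sites, site_location_scores, threshold=0.5):
    """Return the location assigned to a query, or None.

    doc_clicks: historical clicks per responsive document for this query.
    doc_sites: website hosting each document.
    site_location_scores: per-site {location: location-specific score}.
    """
    total_clicks = sum(doc_clicks.values())
    if total_clicks == 0:
        return None

    aggregated = defaultdict(float)
    for doc, clicks in doc_clicks.items():
        weight = clicks / total_clicks  # assumed document weight: click share
        site = doc_sites[doc]
        for location, score in site_location_scores.get(site, {}).items():
            aggregated[location] += weight * score

    if not aggregated:
        return None
    # Assign the top location only if its aggregated likelihood exceeds the threshold.
    best_location = max(aggregated, key=aggregated.get)
    return best_location if aggregated[best_location] > threshold else None


# Hypothetical data for the query "Eiffel Tower".
doc_clicks = {"doc_a": 70, "doc_b": 20, "doc_c": 10}
doc_sites = {"doc_a": "tower.example", "doc_b": "vegas.example", "doc_c": "wiki.example"}
site_location_scores = {
    "tower.example": {"Paris": 0.9},
    "vegas.example": {"Las Vegas": 0.8},
    "wiki.example": {},
}
location = infer_query_location(doc_clicks, doc_sites, site_location_scores)
```

In this made-up data, most clicks go to a document on a site strongly tied to Paris, so Paris clears the threshold and is assigned to the query. Raising the threshold would leave the query without a location.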

A non-transitory computer storage medium stores the programs to be executed by the processors of a computer system. The programs include instructions that, when executed by the computer system, perform the method of associating locations with a query.

Thus the patent provides methods, systems, and computer-readable storage media that infer a physical location for a website or a search query, using information from previously issued search queries.

Search engines can then provide information related to the appropriate physical location to the searcher, creating an enhanced, more efficient searcher interaction with the search engine.

Inferring geographic locations for entities appearing in search queries
Inventors: Sushrut Suresh Karanjkar, Viswanath Subramanian, and Shashidhar Anil Thakur
Assignee: Google LLC
US Patent: 11,176,181
Granted: November 16, 2021
Filed: October 7, 2019

Abstract

A server system associates locations with a query by identifying the query, selecting a set of documents responsive to the query, and assigning weights to individual documents in the set of documents based, at least in part, on historical data of searcher clicks selecting search result links in search results produced for historical queries substantially the same as the identified query. Websites hosting the selected documents are identified. For each website, location-specific information for the locations is retrieved, including a location-specific score that corresponds to the likelihood that the respective location corresponds to an individual website.

For each respective location for which location-specific information is retrieved, the location-specific scores are aggregated, as weighted by the document weights, to compute an aggregated likelihood that the individual location is associated with the query.

A specific location is assigned to the query when predefined criteria are satisfied.

Comparing the Claims from the Different Versions of the Entity Locations Patents

From the 2012 patent, Inferring Geographic Locations for Entities Appearing in Search Queries, we see a relatively short first claim:

1. A method for inferring locations associated with a website, performed by a server system having one or more processors and memory storing one or more programs for execution by the one or more processors, the method comprising: at the server system: identifying a website; for each respective location of a plurality of locations referenced in queries with respective result sets that comprise search result links to documents hosted at the website, computing a location-specific score representing the likelihood that the respective location is associated with the website; computing a site confidence value representing a likelihood that the website is associated with a physical location; determining a location associated with the website using the location-specific scores and the site confidence value; storing information indicating that the determined location is associated with the website for subsequent use when processing respective search queries.

From the 2021 patent, Inferring geographic locations for entities appearing in search queries, here is a much longer first claim that mentions using historical data and user click information:

1. A computer-implemented method, comprising:

Receiving, by a computing system comprising one or more computing devices, a search query;

Identifying, by the computing system, a set of documents that are responsive to the search query;

Assigning, by the computing system, a respective weight to each respective document of the set of documents based, at least in part, on historical data, wherein the historical data is indicative of a number of user clicks on one or more search result links corresponding to one or more historical documents produced for historical queries that are substantially the same as the search query;

Identifying, by the computing system, a plurality of websites comprising a respective website for each respective document of the set of documents;

Identifying, by the computing system, one or more location scores associated with each respective document of the set of documents based on each respective website of the plurality of websites, wherein each respective location score of the one or more location scores is indicative of a likelihood that a respective geographic location is associated with the respective document, wherein the one or more location scores comprise one or more location-specific scores for each respective website of the plurality of websites, wherein each respective location-specific score of the one or more location-specific scores corresponds to a respective geographic location, wherein the respective location-specific score is predetermined based, at least in part, on a site click count indicative of a number of user clicks on one or more respective search results of the one or more search results produced for historical queries that are substantially the same as the search query, wherein the one or more respective search results are associated with one or more documents hosted at the respective web site;

Determining, by the computing system, a query score for each respective document of the set of documents based, at least in part, on the respective weight and the one or more location scores associated with the respective document;

Determining, by the computing system, a geographic location associated with the search query based, at least in part, on the query score for each respective document of the set of documents; and providing for display, by the computing system, a subset of the set of documents and information associated with the geographic location.

Entity Locations Patents Conclusions

The much longer, more recent patent claim shows that Google is looking at more data, including historical data and user click data. Google has evolved how it finds entity locations for queries. This is what I would expect to see ten years later, and it is what we are getting.
