Has Google gotten better at organic search results involving location queries?
I first started looking at patents to help me with a site where the location of the business behind the site was very important to the business it attracted through organic search results. The patent I wrote about then was titled Assigning geographic location identifiers to web pages. Google was recently granted a patent that makes indications of location in prominent places on a site's pages very important, and this more recent patent about geographic location queries reminds me of that older one.
This new patent provides an example of potential problems that can take place with searches that are focused upon geographic locations:
Some users that provide a search query are interested in receiving search results referencing resources that include information relevant to a particular location. For example, a user that submits the search query “Atlanta Family Activities” may be searching for web pages that provide information about the city of Atlanta. Search results provided in response to the search query “Atlanta Family Activities” may include a web page that does not provide information about family activities in Atlanta, or even the city of Atlanta, but rather merely includes the word Atlanta.
For example, one resource referenced by the search results may be a retailer site that includes a drop-down menu enabling the user to specify their current location to identify retail locations near the selected location. Another resource may include the word Atlanta in a footnote of the resource that specifies a business location of the company that developed the web page. Although both of the resources described above include the text “Atlanta,” it is unlikely that these resources would satisfy the informational needs of the user that submitted the search query for “Atlanta Family Activities” because these resources provide very little information about family activities in the city of Atlanta.
This new patent introduces something they refer to as “Semantic Geotokens.” A semantic geotoken is “a standardized representation for the geographic location including one or more location-specific terms for the geographic location.”
Does a geotoken provide enough confidence that a page is about a specific place? That confidence level may be based in part upon whether the place is mentioned in a prominent position on the page.
We are also told that this approach involves making sure a page is specific enough about a location, such as naming both a city and a state. A result for a search about something happening in Atlanta should be a page that tells us it is about the Atlanta in the state of Georgia.
The patent description tells us about the following advantages to be gained from following the process it describes:
(1) Using Semantic Geotokens in indexing pages can help provide more relevant search results.
(2) The search engine takes less time to return search results when it uses relevant Semantic Geotokens.
(3) Result scores for search results tend to be better and more precise because of increased confidence that a result involves the geographic location referenced by a location phrase in a search query.
This recently granted patent is:
Inventors: Daniel Francis Lieuwen, Andrew William Hogue, Greg Morris, and Denis M. Lynch
Assignee: GOOGLE INC.
US Patent 9,582,548
Granted: February 28, 2017
Filed: December 29, 2014
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing geographically relevant search results. In one aspect, a method includes receiving a geotoken for a resource. The geotoken can be a resource token that references a geographic location. A semantic geotoken can be selected using the received geotoken. The semantic geotoken is a standardized representation of the geographic location that includes one or more location-specific terms. The semantic geotoken is stored with a reference to the resource. Neighboring locations for the geographic location are determined. The neighboring locations are within a predetermined distance of the geographic location. Semantic geotokens for the neighboring locations are selected and stored with the reference to the resource. Data specifying the semantic geotokens and the reference to the resource are provided.
Referencing Geographic Locations
Pages on the Web may reference geographic locations in many ways:
(1) Information about a location at which a particular activity or business is located.
(2) A list of geographic locations from which a user can select their particular geographic location.
(3) Geotokens that associate the resource with the geographic location.
Ways in which locations may be referenced:
(1) A listing of the postal address of a business (i.e., an exact street address in a city, such as Atlanta).
(2) A listing of local activities for a city that references only the city name (e.g., Atlanta) without exact addresses for the listed activities.
(3) Different geotokens that refer to the same geographic location, such as a query that refers to a city, a city and a state, or a ZIP code, as in “Family Activities (30309 or Atlanta or Atlanta, Ga.).” The more “or” statements there are in such a query, the longer a search will take.
(4) Geotokens that refer to neighboring locations can also be used in searches, such as a search for Atlanta that also includes (“or”) searches in Decatur, Ga.
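To make the idea of a standardized representation more concrete, here is a minimal sketch of how several raw geotokens might be mapped to one semantic geotoken, with neighboring locations stored alongside it. The lookup tables, the `geo:` naming scheme, and the function names are all assumptions made for illustration; the patent does not publish a format, and mapping a bare “Atlanta” to Georgia is a simplification.

```python
# Hypothetical lookup table: raw geotoken (lowercased) -> semantic geotoken.
# The "geo:" identifiers are invented for this sketch.
SEMANTIC_GEOTOKENS = {
    "30309": "geo:us/ga/atlanta",
    "atlanta": "geo:us/ga/atlanta",  # simplification: assumes the Atlanta in Georgia
    "atlanta, ga": "geo:us/ga/atlanta",
    "atlanta, ga.": "geo:us/ga/atlanta",
    "decatur, ga.": "geo:us/ga/decatur",
}

# Hypothetical table of locations within some predetermined distance.
NEIGHBORS = {
    "geo:us/ga/atlanta": ["geo:us/ga/decatur"],
}


def to_semantic_geotoken(raw):
    """Return the semantic geotoken for a raw geotoken, or None if unknown."""
    return SEMANTIC_GEOTOKENS.get(raw.strip().lower())


def index_tokens_for(raw_geotokens):
    """Collect the semantic geotokens to store with a resource.

    Each recognized location contributes its own semantic geotoken
    plus the semantic geotokens of its neighboring locations.
    """
    tokens = set()
    for raw in raw_geotokens:
        semantic = to_semantic_geotoken(raw)
        if semantic is None:
            continue  # unrecognized text is skipped
        tokens.add(semantic)
        tokens.update(NEIGHBORS.get(semantic, []))
    return tokens
```

With something like this in place, a query would only need to look up one semantic geotoken instead of running a long chain of “or” alternatives.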
This seems to be a broader way to search for queries involving locations than the method described in the “geographic location identifiers” patent I linked to at the start of this post – Google may have improved how geographic searches are done in the decade between the two patents.
Generating Semantic Geotokens
A search engine may generate semantic geotokens to identify resources relevant to a location phrase in a search query. We are told a few things about geotokens, such as:
The semantic geotoken apparatus is a data processing apparatus including one or more processors that are configured to generate semantic geotokens for resources based on one or more references to geographic locations that are associated with the resource (e.g., text identifying a geographic location). A geographic reference is associated with a resource by being included in the resource and/or being included in a reference (e.g., an active link) to the resource. These geographic references are referred to as geotokens.
On-Page Geographic Relevance Scores
The “on page” geographic relevance score may be based on the number, placement, and specificity of geotokens (e.g., words, phrases, meta information, images, audio, or other information specifying a geographic location) used on a page. A web page that uses the text “Atlanta, Ga.” as a title or main heading can have a higher on-page geographic relevance score for the location Atlanta than one using the text “Atlanta” in a dropdown menu or other “boilerplate” content.
The patent defines this in more detail:
A geographic relevance score is a value specifying a likelihood that a resource is relevant to a geographic location that is referenced by a geotoken. As described in more detail below, the semantic geotoken apparatus determines an “on-page” geographic relevance score for a resource. An “on page” geographic relevance score is a value specifying a measure of geographic relevance for a resource based on the geotokens that are included in the resource itself.
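As a rough illustration of how placement could feed into an on-page score, here is a small sketch. The placement names, the weights, and the way the weights are combined are my own assumptions; the patent describes the idea but does not give numbers or a formula.

```python
# Hypothetical weights: how strongly a mention in each placement suggests
# that the page is really about the location.
PLACEMENT_WEIGHTS = {
    "title": 0.9,
    "heading": 0.7,
    "body": 0.3,
    "footnote": 0.1,
    "boilerplate": 0.0,  # e.g., a store-locator dropdown menu
}


def on_page_score(mentions, location):
    """Score in [0, 1] for how relevant a page is to `location`.

    `mentions` is a list of (location, placement) pairs found on the page.
    Each matching mention is treated as independent evidence, so the score
    is 1 minus the chance that every mention is incidental. That
    combination rule is an assumption of this sketch.
    """
    miss = 1.0
    for mentioned_location, placement in mentions:
        if mentioned_location == location:
            miss *= 1.0 - PLACEMENT_WEIGHTS.get(placement, 0.0)
    return 1.0 - miss
```

Under these made-up weights, a city guide with “Atlanta, Ga.” in its title scores far higher for Atlanta than a retailer page that only lists Atlanta in a dropdown and a footnote.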
Off-Page Geographic Relevance Scores
“Off-page” geographic relevance scores come from geotokens included in references to a resource. For example, active links that point to a web page and use the anchor text “Atlanta” (or other references to Atlanta, such as zip codes for Atlanta) may be used to determine an off-page geographic relevance score for the resource relative to the location Atlanta. It makes sense that Google would look at both on-page and off-page signals to decide whether a page is about a particular location, and it probably shouldn’t come as a surprise that they are doing that.
Confidence Scores for Geographic Relevance
On-page and off-page signals that a page is about a particular location are combined into a confidence score, and a high enough confidence score may indicate that a resource is about that location.
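Here is one way the two kinds of signals might be blended. This is only a sketch: the anchor-text matching, the 0.7 weighting toward on-page signals, and the 0.5 threshold are all assumptions of mine, not values from the patent.

```python
def off_page_score(anchor_texts, location_terms):
    """Share of inbound anchor texts that mention any term for the location."""
    if not anchor_texts:
        return 0.0
    terms = [term.lower() for term in location_terms]
    hits = sum(
        1 for text in anchor_texts
        if any(term in text.lower() for term in terms)
    )
    return hits / len(anchor_texts)


def confidence(on_page, off_page, on_page_weight=0.7):
    """Weighted blend of the two scores; the default weight is an assumption."""
    return on_page_weight * on_page + (1.0 - on_page_weight) * off_page


def is_about_location(on_page, off_page, threshold=0.5):
    """True when the blended confidence reaches the (hypothetical) threshold."""
    return confidence(on_page, off_page) >= threshold
```

For example, a page whose inbound links say “Atlanta events” and “things to do in 30309” picks up off-page support for Atlanta even if half of its other links just say “click here.”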
Storing Semantic Geotokens in Google’s Search Index
The patent tells us that these semantic geotokens may be stored in the search index. When you search for “Pizza in Carlsbad,” Google can quickly identify all the pages that are likely relevant to Carlsbad, and then find the ones that are about pizza.
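A toy inverted index shows why this kind of lookup can be fast. This is a sketch under my own assumptions: the class, the `geo:` token format, and the example URLs are hypothetical, and Google's actual index is certainly far more involved.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index mapping each token to the set of pages containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, url, terms, semantic_geotokens):
        """Index a page under its ordinary terms and its semantic geotokens."""
        for token in list(terms) + list(semantic_geotokens):
            self.postings[token].add(url)

    def search(self, term, semantic_geotoken):
        """Pages listed under both the location token and the topic term."""
        in_location = self.postings.get(semantic_geotoken, set())
        on_topic = self.postings.get(term, set())
        return in_location & on_topic


index = TinyIndex()
index.add("https://example.com/carlsbad-pizza", ["pizza"], ["geo:us/ca/carlsbad"])
index.add("https://example.com/carlsbad-surf", ["surfing"], ["geo:us/ca/carlsbad"])
index.add("https://example.com/atlanta-pizza", ["pizza"], ["geo:us/ga/atlanta"])

results = index.search("pizza", "geo:us/ca/carlsbad")
```

Because every spelling of the location was reduced to one token at indexing time, the query side needs a single set intersection rather than a separate lookup for each variant.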
Geographical Relevance Scores for Geotokens
The patent provides some more hints about how powerful some geotokens are, such as:
(1) A geotoken that indicates a location in a page title is more powerful than a geotoken that indicates a location in a footnote on a page.
(2) A page title that refers to just one location carries more weight than a title that refers to more than one location.
(3) A location that is referred to on a page earlier than other locations may be considered a leading geotoken and it may be a strong indication of which location is most relevant to a page.
(4) A geotoken in a tag on a page may rank higher than if that location was just in the content on the page.
(5) Geotokens in boilerplate may be ignored or assigned a lower weight than other geotokens.
(6) Qualified geotokens may be merged and assigned higher weights than other geotokens, such as geotokens for “Cleveland” and “Ohio” appearing on the same page – which may be combined into a “Cleveland, Ohio” geotoken.
(7) Off-page geotokens can be combined with on-page geotokens, like a link using the anchor text “Cleveland” pointing to a page with the title “Ohio.”
(8) The geographic relevance scores for geotokens that are more precise, like a full postal street address, are higher than the geographic relevance scores for less precise geotokens such as just a city name.
(9) Geotokens that are things like Street Addresses are considered “high precision geotokens.”
(10) Neighboring locations may be included in searches for a specific location, but with lowered confidence scores. A search for “Pizza in Somerville, NJ,” for example, might show results for Raritan, NJ, which is 10 kilometers away. The lower confidence score means a lower search result ranking.
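The last point, about neighboring locations, can be sketched as a simple distance discount. The linear falloff and the 25-kilometer cutoff below are assumptions I picked for illustration; the patent only says that neighbors fall within a predetermined distance and receive lower confidence.

```python
def neighbor_confidence(confidence, distance_km, max_distance_km=25.0):
    """Discount a page's confidence when it matches a neighboring location.

    The confidence falls off linearly with distance and reaches zero at
    `max_distance_km`, which stands in for the predetermined distance.
    """
    if distance_km < 0 or max_distance_km <= 0:
        raise ValueError("distances must be non-negative and the cutoff positive")
    if distance_km >= max_distance_km:
        return 0.0  # too far away to count as a neighbor
    return confidence * (1.0 - distance_km / max_distance_km)
```

A page about the searched-for town itself (distance 0) keeps its full confidence, a page about a town 10 kilometers away keeps only part of it, and so it would tend to rank lower.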