Sentiment as a Ranking Signal for Entities

Posted in: SEO

May 20, 2016

Sentiment Phrases

Back before there were Google+ pages for businesses, Google had Place Pages for businesses, and Google would show off sentiment-rich snippets from reviews of those places, which I wrote about in Google’s New Sentiment Phrase Snippets For Google Places.

I was recently discussing sentiment-rich reviews with Barbara Starr, who pointed some out to me on Google My Business pages and noted that several of the sentiment statements shown for a business came from the same review, even when that business had other reviews. It seemed that the amount of sentiment in a review made a difference as to whether its statements appeared or not. What inspired that, and what impact might it have?

Sentiment for Ranking

With that question about the importance of sentiment in reviews in mind, I came across a Google patent granted last month that describes a role for sentiment in ranking the entities that people review. The patent discusses how ratings might be used to rank those entities, how basing the ranking on the sentiment expressed in review text might avoid problems such as the information lost when ratings are averaged, and how people often want to see reviews of entities:

Attempts to use sentiment as a ranking signal for search results have commonly used structured reviews. In structured reviews, the reviewer selects a rating in addition to providing a textual review of the entity. Structured reviews can be conveniently used in ranking systems as most structured reviews use a numeric rating (e.g. a 5 star system or a scale of 1 to 10) that can easily be used to rank results. Results are ranked by their average numeric rating from the structured review. However, in instances where an entity has mixed reviews valuable information may be lost due to the averaging.

The patent also tells us that relying only on a numerical rating can lose important detail from the review itself, such as “food great, service bad”.
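To make the averaging problem concrete, here is a minimal sketch. The restaurant names and star ratings are made up for illustration and do not come from the patent. Both places end up with the same average, even though one has consistently middling reviews and the other has sharply divided ones.

```python
from statistics import mean, pstdev

# Hypothetical star ratings for two made-up restaurants (not from the patent).
ratings = {
    "Restaurant A": [3, 3, 3, 3],  # consistently middling
    "Restaurant B": [5, 5, 1, 1],  # sharply divided opinions
}


def summarize(stars):
    """Return (average, spread) for a list of star ratings."""
    return mean(stars), pstdev(stars)


for name, stars in ratings.items():
    avg, spread = summarize(stars)
    print(f"{name}: average={avg}, spread={spread:.1f}")
```

The averages match, so a ranking based only on the average cannot tell these two places apart. The spread differs, which is the kind of information the patent says can be lost.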

It also points out that text expressing sentiment about a reviewable entity, even when it comes from outside a review and rating system, could be incorporated into a ranking system that uses sentiment to rank that entity. Blogs and personal web pages are two sources that could be used (and having said that, I would give this patent high marks for pointing those sources out!)

The patent doesn’t use ratings from reviews to influence rankings, though it does say that people are likely interested in seeing information about entities that have been rated highly. Here’s how the patent tells us it works instead:

Text providing reviews of entities is identified.
Sentiment scores are assigned to that review text.
Ranking scores for the entities are generated based, at least in part, on the sentiment scores.
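The three steps above can be sketched roughly as follows. This is only an illustration under loud assumptions: the tiny word lists, the function names, and the use of a simple mean as the ranking score are all mine, chosen for brevity. The patent does not say that sentiment is scored by counting words, and it describes more factors than this.

```python
from collections import defaultdict

# Toy sentiment lexicon: an assumption for illustration, not the patent's method.
POSITIVE = {"great", "fresh", "friendly"}
NEGATIVE = {"bad", "slow", "rude"}


def sentiment_score(text):
    """Count positive words minus negative words in a piece of review text."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def rank_entities(reviews):
    """Take (entity, review_text) pairs and return (entity, score) pairs, best first."""
    scores = defaultdict(list)
    for entity, text in reviews:                       # step 1: review text tied to an entity
        scores[entity].append(sentiment_score(text))   # step 2: sentiment score per review
    # Step 3: a ranking score per entity (a plain mean here, purely a placeholder).
    ranking = {entity: sum(s) / len(s) for entity, s in scores.items()}
    return sorted(ranking.items(), key=lambda pair: pair[1], reverse=True)


# Made-up reviews of made-up restaurants.
reviews = [
    ("Sushi Hut", "Great fish, friendly staff."),
    ("Sushi Hut", "Fresh rolls but slow service."),
    ("Roll House", "Bad rice and rude waiters."),
]

for entity, score in rank_entities(reviews):
    print(entity, score)
```

Note that a review like “food great, service bad” scores zero with this toy lexicon, which shows why a real system would need to look at what each piece of sentiment is about rather than just netting words against each other.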

It’s possible that these ranking scores may go into scores for places that might appear in Google Maps results, but would only be part of the things that influence ranking for places. Google’s help pages tell us that they rank places based upon relevance, distance, and prominence. So, sentiment might be a factor that could be included with those as well.

The patent is:

Sentiment detection as a ranking signal for reviewable entities
Inventors: Sasha Blair-Goldensohn, Kerry Hannan, Ryan McDonald, Tyler Neylon, and Jeffrey C. Reynar
Assignee: Google
US Patent 9,317,559
Granted: April 19, 2016
Filed: April 8, 2013

Abstract

A method, a system, and a computer program product for ranking reviewable entities based on sentiment expressed about the entities. A plurality of review texts is identified wherein each review text references an entity. A plurality of sentiment scores associated with the plurality of review texts are generated, wherein each sentiment score for a review text indicates a sentiment directed to the entity referenced by the review text. A plurality of ranking scores for the plurality of entities is generated wherein each ranking score is based at least in part on one or more sentiment scores associated with one or more review texts referencing the entity. A plurality of search results associated with the plurality of entities are displayed based at least in part on the ranking scores.

Take Aways

This patent covers more than just places; it includes all kinds of entities that might be reviewed by people, and the patent provides this set of examples: “restaurants, hotels, consumer products such as electronics, films, books, and live performances.”

The patent tells us about an entity review data repository, which might contain both structured and unstructured reviews. Structured reviews tend to follow a format that includes things like ratings and review text, and the patent tells us they might come from sites such as “Google Maps, TripAdvisor, Citysearch or Yelp.” These structured reviews may also come from places such as books, newspapers, and magazines. Unstructured reviews are textual documents which tend to contain opinions about reviewable entities but don’t have the formatting that structured reviews do. They might come from places such as “blogs, emails, newsgroup postings.”

The patent also mentions these types of databases that might be used to collect information about entities: 1) an Entity Ranking Data Repository, 2) an Entity Ratings Data Repository, and 3) a User Interaction Database. That User Interaction Database would store “user interaction metrics generated from monitoring user interactions with search results associated with entities.” Those are not necessarily the type of user interactions that I wrote about in “Satisfaction a Future Ranking Signal in Google Search Results?” Still, it makes sense that if you are going to base rankings on the sentiment expressed about entities, you would also try to find a way to count user interactions if possible.

The patent tells us more about the entity sentiment database at the heart of it. It’s constructed out of tuples that have multiple parts, such as an Entity ID, an Entity Type, and one or more reviews. Each review would have a review ID, an entity value, a sentiment value, and review text. The patent doesn’t tell us that the Entity ID would be a machine ID like those from Freebase or Wikidata, but it sounds like it could be.

Review Tuples

The entity value is a score that is based upon the likelihood that the review is about that specific entity.

The sentiment value is a score that reflects the sentiment the review expresses about the entity.
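One way to picture these tuples is as a pair of small record types. This is a sketch only: the class names, field names, ID formats, and example values below are hypothetical, and the patent does not specify how the data is actually stored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Review:
    review_id: str
    entity_value: float     # how likely it is that the review is about this entity
    sentiment_value: float  # score for the sentiment expressed about the entity
    text: str


@dataclass
class EntityRecord:
    entity_id: str    # hypothetical ID; the patent does not say what form it takes
    entity_type: str
    reviews: List[Review] = field(default_factory=list)


# Made-up example record.
record = EntityRecord(
    entity_id="entity-001",
    entity_type="sushi restaurant",
    reviews=[
        Review("review-001", entity_value=0.9, sentiment_value=0.8,
               text="Food great, service bad."),
    ],
)

print(record.entity_type, len(record.reviews))
```

Keeping the entity value next to each review means a ranking step could, for example, give less weight to reviews that are only loosely tied to the entity. That use is my assumption, not something the patent states in these terms.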

We are told about how review text might be processed before it is included in this sentiment database.

We are also told a little about how sentiment scores are generated based upon the type and the polarity of the sentiment (positive or negative), using heuristics to determine the magnitude, or strength, of the sentiment.

Scoring could be based upon things such as: “polarity of the sentiment, the type of attitude expressed in the sentiment, confidence in the sentiment, identity of the source/author, the overall amount of sentiment-laden text identified, and relative importance of features about which sentiment is expressed.”

We don’t know how much of an impact these sentiment scores might have on ranking because they may be combined with other scores:

The Ranked Entities in the Entity Ranking Database are displayed responsive to search queries that reference the Entity Type. The Entity Rankings are used as signals to rank the set of Ranked Entities when displaying the Ranked Entities as search results. For example, a user who enters “sushi” as a search query will receive an ordered list of Ranked Entities of Entity Type “sushi restaurant” ranked according to Entity Ranking. According to the embodiment, the Entity Ranking can be combined with other signals to rank the set of Ranked Entities such as signals based on the number of times the Ranked Entity is mentioned on an index of web pages or the geographic location of the Ranked Entities relative to a geographic location of a user performing a search.
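The passage above names two other signals that could be combined with the Entity Ranking: how often an entity is mentioned on indexed web pages, and how far it is from the searcher. A minimal sketch of one way such signals could be blended is a weighted sum. The weights, the function name, the transformations, and the example numbers are all assumptions for illustration; the patent does not give a formula.

```python
import math


def combined_score(entity_ranking, mentions, distance_km,
                   w_sentiment=0.5, w_mentions=0.3, w_distance=0.2):
    """Blend a sentiment-based ranking with mention and distance signals.

    All weights and transformations are hypothetical.
    """
    mention_signal = math.log1p(mentions)          # dampen very large mention counts
    proximity_signal = 1.0 / (1.0 + distance_km)   # closer entities score higher
    return (w_sentiment * entity_ranking
            + w_mentions * mention_signal
            + w_distance * proximity_signal)


# Made-up sushi restaurants: (entity ranking, web mentions, distance in km).
candidates = {
    "Sushi Hut": (0.9, 100, 1.0),
    "Roll House": (0.2, 100, 1.0),
}

ordered = sorted(candidates,
                 key=lambda name: combined_score(*candidates[name]),
                 reverse=True)
print(ordered)
```

With the other two signals held equal, the entity with the stronger sentiment-based ranking comes first. Change the weights and that may no longer hold, which is the point made above: we don’t know how much of an impact sentiment has once it is mixed with other signals.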

The patent also tells us that the “User Interaction Score” it refers to might also play a role in ranking, after being created from things such as user click-throughs and time spent at web pages associated with Ranked Entities presented in search results. We don’t know this for certain, and we’ve been told by Google representatives that signals like these are often too noisy (and prone to abuse) to be relied upon in ranking results.

The patent provides details of other moving pieces that may play a role in ranking entities based upon sentiment expressed in reviews. We don’t know whether or not Google has implemented the process described in this patent, or whether or not they will. They do provide sentiment-rich snippets of reviews in knowledge panels, and people seem to want to see that kind of information, so there is value in Google showing it even if it isn’t used to rank entities.

About Bill Slawski

With more than 26 years of SEO experience and a Juris Doctor Degree, Bill Slawski is the foremost expert on Google’s patents as related to SEO. Patent Exploration is one of the quickest and most detailed ways to find new information about SEO. Bill is the Editor of SEO by the Sea, a prominent search engine optimization blog, where he is the author of over 1,300 posts. Bill’s experience includes Fortune 500 brands and some of the largest websites in the world. Bill is a contributing author for Moz, Search Engine Land, and Search Engine Journal. In 2014-2021, he spoke at industry-leading international conferences about topics including search engine algorithms, universal and blended search, personalization in search, search and social, and duplicate content problems, structured data, and schema