Google Safe Search Results Patent and Reranking or Removing Results

Posted October 10, 2018


How Does Google Use a Safe Search Results Filter to Handle Inappropriate, Sensitive, or Offensive Search Results?

One of the problems that searchers may have with the internet is being surprised by content that they don’t expect to see, or wouldn’t be willing to expose other people to. A recently granted patent from Google tells us about the search engine’s efforts to protect searchers from such content. As they phrase it within that patent:

Internet users can search for various types of content using search engines. Internet content may include sensitive or offensive content such as, for example, pornography, gory images, and violent content. In some cases, users may involuntarily be exposed to inappropriate sensitive or offensive content. Accordingly, it may be desirable to limit exposure to inappropriate sensitive or offensive content available on the Internet.

The patent refers to Safe Search Results a number of times, and I remember hearing that Google’s Matt Cutts was responsible for Google showing Safe Search Results before he became Google’s Head of Webspam. I wondered whether there was a patent on Safe Search Results, or whether he may have had anything to do with it, but I hadn’t seen one. This one mentions Safe Search Results enough times that I’m likely going to think of it as Google’s Safe Search Results Patent.

The purpose behind the Google Safe Search Results patent is to protect searchers using the Google Search engine from being exposed to content that they might not want to see:

This disclosure generally describes a method and system for applying classifiers to search queries and search results to provide a search experience in which users are protected from exposure to inappropriate offensive or sensitive content.

Exactly how do such safe search results classifiers work?

1) In response to a query, a search engine returns a preliminary set of search results.
2) The query is classified by a classifier to determine whether it includes one or more terms associated with a protected class of people, or terms associated with sensitive or offensive content, such as pornographic or violent content.
3) The preliminary results are also classified to determine whether they contain sensitive or offensive content.
4) Search results are returned to a searcher so that inappropriate sensitive or offensive content is not shown to that searcher.
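The four steps above can be sketched in code. Everything below is a toy illustration: the term lists and the keyword-matching classifiers are my own stand-ins for whatever machine-learned classifiers Google actually uses.

```python
from typing import List, Tuple

# Illustrative term lists; real classifiers would be learned models, not keyword sets.
SENSITIVE_TERMS = {"gore", "violence", "pornography"}
PROTECTED_CLASS_TERMS = {"teenagers", "children", "teen"}

def classify_query(query: str) -> Tuple[bool, bool]:
    """Step 2: return (relates_to_protected_class, contains_sensitive_terms)."""
    terms = set(query.lower().split())
    return bool(terms & PROTECTED_CLASS_TERMS), bool(terms & SENSITIVE_TERMS)

def classify_result(snippet: str) -> bool:
    """Step 3: True if the result likely contains sensitive or offensive content."""
    return bool(set(snippet.lower().split()) & SENSITIVE_TERMS)

def safe_search(query: str, candidates: List[str]) -> List[str]:
    """Steps 1-4: filter a preliminary result set using both classifications."""
    protected, _query_sensitive = classify_query(query)
    kept = []
    for snippet in candidates:
        # Withhold sensitive results when the query relates to a protected class.
        if protected and classify_result(snippet):
            continue
        kept.append(snippet)
    return kept
```

For a query like “why teenagers join groups”, a result snippet containing the word “violence” would be dropped, while clean snippets pass through unchanged.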

An indication is generated that the search query is classified as including query terms that

(i) do not likely relate to a particular class of people,
(ii) likely relate to the particular class of people, or
(iii) likely relate to the particular class of people and include sensitive or offensive terms.

Also, an indication is generated that each search result is classified as likely including

(i) non-sensitive and non-offensive content, or
(ii) sensitive or offensive content.

From the candidate set of search results, a presentation set of safe search results is selected based at least on:

(I) the indication that the search query is classified as including query terms that

(i) are not likely related to the particular class of people,
(ii) are likely related to the particular class of people, or
(iii) are likely related to the particular class of people and include sensitive or offensive terms, and

(II) the indication that the search result is classified as likely including:

(i) non-sensitive and non-offensive content, or
(ii) sensitive or offensive content. The one or more search results of the presentation set of search results are provided for output in response to the search query.
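The three-way query classification and the two-way result classification above can be modeled as small enumerations. The names below are my own shorthand, not the patent’s terminology:

```python
from enum import Enum

class QueryClass(Enum):
    NOT_PROTECTED = 1            # (i) terms not likely related to the class of people
    PROTECTED = 2                # (ii) terms likely related to the class of people
    PROTECTED_AND_SENSITIVE = 3  # (iii) related to the class, with sensitive/offensive terms

class ResultClass(Enum):
    CLEAN = 1      # (i) non-sensitive and non-offensive content
    SENSITIVE = 2  # (ii) sensitive or offensive content
```

Every candidate result then carries a pair of these signals: one from its query, one from its own content.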

For some of those results, the particular class of people affected includes a group of people having at least one demographic characteristic in common.

In some results, the sensitive or offensive terms may include terms associated with one or more of pornography, violence, gore, and spoof. The sensitive or offensive content includes images, video, or data associated with one or more of those same categories.

In some cases, the selection of the presentation set of safe search results from among the candidate set of search results includes one or more of the following actions:

In some cases, a ranking of a search result in the candidate set of search results is reduced based on

(i) the indication that the search query used to obtain the search result is classified as likely related to the particular class of people, and
(ii) the indication that the search result is classified as likely including sensitive or offensive content.

In some cases, a search result in the candidate set of search results may be filtered to remove the search result from the presentation set of search results based on

(i) the indication that the search query used to obtain the search result is classified as likely related to the particular class of people and including sensitive or offensive terms, and
(ii) the indication that the search result is classified as likely including sensitive or offensive content.

In some cases, a search result in the candidate set of safe search results may be selected to be included in the presentation set of search results without modifying a ranking of the search result or filtering the search result based on the indication that the search query used to obtain the search result is classified as not likely related to a particular class of people and as likely including non-sensitive and non-offensive terms.
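Taken together, the three cases above amount to a small decision table. The function below encodes one reading of them using boolean classification flags; it is a sketch, not the patent’s claim logic:

```python
def select_action(query_protected: bool, query_sensitive: bool,
                  result_sensitive: bool) -> str:
    """Map the classification signals for one candidate result to an action."""
    if not result_sensitive:
        return "keep"    # clean results pass through without reranking or filtering
    if query_protected and query_sensitive:
        return "filter"  # remove the result from the presentation set entirely
    if query_protected:
        return "demote"  # keep the result, but reduce its ranking
    return "keep"        # sensitive result, but query not about a protected class
```

So a sensitive result is filtered only when the query both relates to a protected class and itself contains sensitive terms; if the query merely relates to a protected class, the result is demoted instead.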

In some cases, the selection of the presentation set of search results from among the candidate set of search results includes one or more of the following actions:

(1) In some cases, a search result in the candidate set of search results may be selected to be included in the presentation set of search results without modifying a ranking of the search result or filtering the search result based on the indication that the search query used to obtain the search result is classified as not likely related to a particular class of people and as likely including sensitive or offensive terms.

(2) In some cases, a search result in the candidate set of search results may be filtered to remove the search result from the presentation set of search results based on the indication that the search query used to obtain the search result is classified as likely related to the particular class of people and as likely including sensitive or offensive terms.

(3) In some cases, the actions of the computer-implemented method may also include generating a relevance score for a document corresponding to each search result in the candidate set of search results, determining a ranking for each search result in the candidate set of search results, and receiving user session data that includes one or more attributes of a user device.

The relevance score is indicative of a relevance of the document to the search query. The selection of the presentation set of search results from among the candidate set of search results further includes modifying rankings of one or more search results in the candidate set of search results based on the user session data that includes one or more attributes of the user device.

In some cases, the selection of the presentation set of search results from among the candidate set of search results may include, for each document corresponding to a search result, assigning a label to the document based at least on the indication that the search result is classified as including sensitive or offensive content, and determining to filter the search result or modify the ranking of the search result based on the assigned label. The label is indicative of subject matter included in the document.
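Here is a sketch of how labels and user-session data might feed the final selection. The label values, the `safe_search_level` attribute, and the demotion factor are all assumptions made for illustration:

```python
def select_presentation_set(candidates, session):
    """candidates: list of dicts with 'title', 'score', and 'label' keys.
    session: user-session data, i.e. attributes of the user device."""
    strict = session.get("safe_search_level") == "strict"  # assumed attribute name
    kept = []
    for c in candidates:
        if c["label"] == "sensitive":
            if strict:
                continue                          # filter the result entirely
            c = {**c, "score": c["score"] * 0.5}  # otherwise demote its ranking
        kept.append(c)
    # Rank the presentation set by the (possibly modified) relevance scores.
    return sorted(kept, key=lambda c: -c["score"])
```

With a strict setting, the sensitive result is removed outright; otherwise it stays in the list but falls in the ranking.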

The patent where all of these filtering or inclusions may be made is described in:

Protecting users from inappropriate sensitive or offensive search results
Inventors: Matthias Heiler, Michael Schaer, Nikola Todorovic, Robin Nittka, Thomas Fischbacher and Laura Dragoi;
Assignee: Google LLC
US Patent: 10,083,237
Granted: September 25, 2018
Filed: August 31, 2015

Abstract

A system and method for providing a search experience in which users are protected from exposure to inappropriate offensive or sensitive content is described. A search system may classify a search query and candidate search results obtained in response to the search query. Based on the classification of the search query and search results, the candidate search results may be modified to generate a set of search results presented to a user such that the presented search results do not include inappropriate sensitive or offensive content.

Some definitions under the Safe Search Results Patent

The patent defines “sensitive or offensive content” for us as referring to, but not being limited to, pornography, gory images, and violent content.

It also defines “inappropriate sensitive or offensive content” as a subcategory of sensitive or offensive content, which may include content such as:

  • Gang recruitment content
  • Violence inciting content
  • Content mocking a particular demographic group or inciting hatred against a particular demographic group
  • Spoof content

“It may also generally refer to any content that is illegal, abusive, or highly objectionable to a protected class of Internet users.”

The patent also tells us that it aims at protecting searchers who might be sensitive to some search results:

That protected class of users may include any group of people having at least one demographic characteristic in common and for whom protection from inappropriate sensitive or offensive Internet content may be desired.

How Content is Treated for Teenagers as a Protected Class

The Safe Search Results Patent’s description starts off with an example of search results that target teenagers, in response to a query such as “Why teenagers join groups”. The search engine receives that query and obtains a set of candidate search results to show in response to it.

The search engine also receives a number of classification signals, and it selects a set of search results from the set of candidate search results to present to a child searcher, based on the classification signals.

One of the search results entitled “Teen Recruitment” is given a relevance score of 96 and assigned a label indicating that the search result document corresponding to “Teen Recruitment” includes content that can be presented to all users, including a child user, and does not include sensitive or offensive content.

Another result is entitled “Teen gang recruiters” with a relevance score of 87 and assigned a label “v” indicating that the search result document corresponding to “Teen gang recruiters” likely includes violent content or disturbing images.

Some other search results are entitled “News: Teen Groups” and “Groups of kids”, which have relevance scores of 79 and 34, respectively. Their assigned labels indicate that the corresponding documents likely include content that can be presented to all users, including a child user, and do not include sensitive or offensive content.

Based upon the classification signals used, some results may be approved, some may be removed, and others may be reranked before they are presented to a child searcher.

Some of the results may be reranked to be presented higher in search results, such as the one about “News: Teen Groups”.

Some of the results may be reranked to a lower ranking, such as the one about “Teen gang recruiters” that is likely to have violent content associated with teens.
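This reranking can be reproduced numerically using the relevance scores from the patent’s example. The 0.5 demotion factor below is an assumption; the patent does not give one:

```python
# Scores and labels from the patent's teen-search example;
# "v" marks the result labeled as likely containing violent content.
results = [
    ("Teen Recruitment", 96, "ok"),
    ("Teen gang recruiters", 87, "v"),
    ("News: Teen Groups", 79, "ok"),
    ("Groups of kids", 34, "ok"),
]

# Halve the score of "v"-labeled results for a child user (assumed factor),
# then re-sort by the adjusted scores.
reranked = sorted(
    ((title, score * 0.5 if label == "v" else score, label)
     for title, score, label in results),
    key=lambda r: -r[1],
)
# "Teen gang recruiters" (now 43.5) drops below "News: Teen Groups" (79).
```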

The Safe Search Results Patent tells us that the focus of this approach is:

The selected set of search results are then output as a presentation set of search results at the user device such that the child user can enjoy a safe search experience without exposure to inappropriate sensitive or offensive content.

Other Protected Groups and Other Filters

The safe search results patent provides other examples, for other groups, such as results about

(i) “Patent attorney spoof” with a relevance score of 96 and a label indicating that the document associated with the search result entitled “Patent attorney spoof” likely includes inappropriate sensitive or offensive content such as spoof content associated with a protected class of people (e.g., patent attorneys).

(ii) “Pranks on IP practitioners” with a relevance score of 92 and a label indicating that the document associated with the search result entitled “Pranks on IP practitioners” likely includes inappropriate sensitive or offensive content such as insulting jokes directed to a protected class of people;

(iii) “Funny patent attorney moments” with a relevance score of 89 and a label indicating that the document associated with the search result entitled “Funny patent attorney moments” does not likely include any sensitive or offensive content; and

(iv) “Humor an attorney” with a relevance score of 74 and a label indicating that the document associated with the search result entitled “Humor an attorney” does not likely include any sensitive or offensive content.

The patent tells us which content from these queries would likely be removed or reranked.

The patent also includes other examples covering such things as political jokes, political memes, and political scandals.

It also provides more details about how the search engine identifies content that it would want to filter.

Biggest Takeaway From the Safe Search Results Patent

I hadn’t seen Google say anything before about removing or reranking search results to provide safe search results involving protected groups. I do remember, though, a legal case against Google in which a Federal Court decided back in 2003 that PageRank was speech protected under the First Amendment.

