When Google announced the Hummingbird Update at its 15th birthday event, it also showed how it was handling conversational search, including understanding pronouns that refer back to earlier queries in a query session. I wrote about that in a post titled “Searching with Pronouns: What are they? Co-references in Followup Queries.”
Google was granted a patent within the past couple of weeks on how the search engine might understand and use conversational queries. The Hummingbird Update was announced on September 26, 2013. This conversational search patent was filed with the USPTO on July 31, 2014, and it provides a lot of detail on how conversational search works at Google.
Conversational search is based upon natural language processing, in which a search engine uses the way people speak to respond to questions they might ask in natural language. This can involve indexed content that is tagged in a way that enables a search engine to understand the context of a query. As the description from the patent tells us early on:
This specification describes retrieving and using contextual data from previous sessions in a conversational search by determining that a query refers to one or more tags in an index repository, determining one or more particular session identifiers associated with the tags in the index repository, retrieving particular contextual data associated with the particular session identifiers in a data repository, and performing an action responsive to the query based on the retrieved particular contextual data.
The patent tells us that the process it describes is innovative because it can handle a query that refers to something in a previous session, such as a pronoun (“Who was Barack’s wife?” followed by “When was her birthday?”) or a time (“What was the email I received yesterday from an airline?”).
These queries might be answerable by a search engine because the context of those queries is understood after all potential answers are tagged with context information.
Query Tags and Answering Conversational Search Queries
To understand what is happening in the process behind this patent, it helps to understand what is meant by “tags” when they are referred to in the patent. I liked this breakdown of tags into session tags or item tags. Keep in mind that tags are associated with conversational queries, and help with the context of answers to those queries:
In some examples, the one or more tags include at least one of a session tag or an item tag. The session tag may be associated with information about particular corresponding user sessions for the particular identifiers. The item tag may be associated with at least one of particular queries in the particular corresponding user sessions or particular search results responsive to the particular queries. The conversational search system may determine that at least one of the particular queries or the particular search results refers to a particular item corresponding to the item tag and act responsively to the first query based on the determined particular item. In some cases, before determining that the first query refers to one or more tags in a first repository, the conversational search system may determine that the first query lacks additional information required for acting and that the first query is not associated with a previous query for the first user session.
Advantages of Looking at Context from Previous Sessions
I enjoy patents that provide a list of the advantages of following a process described in those patents, and this one gives us such a list. Here are the benefits of Google following this patented process:
- A conversational search system can make conversational search smarter and more natural by quickly and seamlessly retrieving and using contextual data from previous conversational searches into current conversational search.
- The conversational search system can determine relevant previous searches based on little, limited, or partial information that the user remembers, which minimizes user input and helps the user in the searches.
- The conversational search system can efficiently provide quick responses, e.g., by associating short conversation sessions with session labels (e.g., time labels, user location labels, user action labels, and/or co-presence information labels) and/or item labels in an index repository and searching relevant conversation sessions in the index repository based on query segments corresponding to the labels, instead of searching all the previous queries and answers, which may result in a huge number of spurious matches when retrieving.
I also enjoy it when a patent describes how entities are used in the process it covers. This one does, and that provides more insight into how the process works, with sample queries that show off the scope of conversations:
The index repository may be keyed by entity annotations, e.g., “what book was I talking about this morning?” The user might have been using a book title without ever mentioning the word “book” in a previous query and the conversational search system may determine the word “book” based on entity annotations. The index repository may be also keyed by attributes of entities (e.g., addresses for restaurants) and other metadata for entities (e.g., customer ratings for restaurants). Fourth, the conversational search system can achieve high accuracy and reliability, e.g., by determining item labels for items such as entities based on an annotation database that stores millions of items and associated item labels. Fifth, the conversational search system can be extensible to any suitable language besides English.
The patent covering conversational search queries and the use of tags in applying natural language processing to answer those queries is:
Retrieving context from previous sessions
Inventors: Ajay Joshi
Assignee: Google LLC
US Patent: 10,235,413
Granted: March 19, 2019
Filed: July 31, 2014
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for retrieving and using contextual data from previous conversation sessions in conversational searches. In one aspect, a method includes receiving a first query for a first user session, determining that the first query refers to one or more tags in a first repository, the first repository associating respective identifiers to respective tags, each identifier representing a corresponding user session, determining one or more particular identifiers associated with the one or more tags in the first repository, retrieving particular contextual data associated with the determined particular identifiers in a second repository, the second repository associating respective identifiers to respective contextual data associated with corresponding user sessions represented by the respective identifiers, and acting responsive to the first query based on the retrieved particular contextual data.
The patent tells us more about when content is tagged, which is essential for being able to answer conversational search queries:
The search engine enables retrieving and using contextual data associated with previous user sessions in conversational searches. The search engine stores and indexes previous user sessions with associated tags in an index repository. When receiving a query in a new conversational search, the search engine determines that the query refers to one or more tags, e.g., a time tag, in the index repository, and then determines one or more previous sessions indexed with the tags. The search engine then retrieves contextual data associated with the determined previous sessions, e.g., in a data repository, and acts responsive to the query based on the retrieved contextual data.
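The lookup described in that passage can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation: the class name, the tag strings such as "time:yesterday", and the choice to require that a session match every tag in the query are all hypothetical. The patent only says that tags map to session identifiers in one repository, and that session identifiers map to contextual data in another.

```python
from collections import defaultdict


class ConversationStore:
    """Toy model of the index repository and data repository (names are hypothetical)."""

    def __init__(self):
        # Index repository: tag -> set of session identifiers.
        self.index_repository = defaultdict(set)
        # Data repository: session identifier -> contextual data.
        self.data_repository = {}

    def store_session(self, session_id, tags, context):
        """Save a session's contextual data and index it under each tag."""
        self.data_repository[session_id] = context
        for tag in tags:
            self.index_repository[tag].add(session_id)

    def retrieve_context(self, query_tags):
        """Return contextual data for sessions indexed under all of the query tags."""
        matching = None
        for tag in query_tags:
            session_ids = self.index_repository.get(tag, set())
            # Assumption: a session must carry every tag the query refers to.
            matching = session_ids if matching is None else matching & session_ids
        return [self.data_repository[s] for s in sorted(matching or ())]


store = ConversationStore()
store.store_session("session-1", ["time:yesterday", "item:airline"],
                    {"query": "email from an airline", "result": "flight confirmation"})
store.store_session("session-2", ["time:yesterday", "item:restaurant"],
                    {"query": "french restaurants", "result": "Gary Danko"})

# "What was the email I received yesterday from an airline?"
context = store.retrieve_context(["time:yesterday", "item:airline"])
print(context)
```

Keeping the tag index separate from the contextual data mirrors the advantage the patent lists: the search runs over a small set of labels rather than over every previous query and answer.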
This image from the patent provides some information about conversational searches at Google and different types of actions that may be associated with those queries:
More About Conversational Search
My write-up of this patent is based on the summary in the patent description. The section of the description labeled “Detailed Description” goes into more depth if you want a sense of how all of the parts work together. For instance:
In some implementations, data is associated with the identifiers from the search requests so that a search history for each identifier can be accessed. The selection logs and query logs can thus be used by the search engine to determine the respective sequences of queries submitted by the user devices, the actions taken in response to the queries, and how often the queries have been submitted.
We do see references in this patent to previously mentioned entities and when they may be referred to by pronouns:
In some examples, the search engine determines whether a query is associated with a previous query by determining whether the query or a search result responsive to the query includes an explicit or implicit reference to the previous query. In a particular example, if the previous query refers to an entity and the query includes a corresponding pronoun for the entity, the search engine determines that the query is associated with the previous query. In another particular example, the search engine determines that the query is associated with the previous query if the query has the same intent as the previous query, e.g., using a template as described further below.
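The pronoun case in that passage might look something like the following Python sketch. It is a deliberately simplified assumption of how such a check could work: the pronoun table, the function name, and the idea of matching a pronoun against a gender tag on the earlier entity are mine, not details from the patent.

```python
import re

# Hypothetical mapping from pronouns to the gender tag they could refer to.
PRONOUN_GENDER = {
    "she": "female", "her": "female", "hers": "female",
    "he": "male", "him": "male", "his": "male",
}


def is_associated_with_previous(query, previous_entity_tags):
    """Return True if the query has a pronoun that fits the earlier entity's gender tag."""
    gender = previous_entity_tags.get("gender")
    if gender is None:
        # Without a gender tag on the earlier entity there is nothing to match against.
        return False
    words = re.findall(r"[a-z']+", query.lower())
    return any(PRONOUN_GENDER.get(word) == gender for word in words)


# The previous query "Who was Barack's wife?" returned an entity tagged as female.
previous_entity = {"name": "Michelle Obama", "gender": "female"}
print(is_associated_with_previous("When was her birthday?", previous_entity))
```

A real system would need far more than a lookup table (plural pronouns, "it", ambiguous references), but the sketch shows why tagging the earlier entity matters for resolving the follow-up.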
More details about tags and entities surface in the remainder of the patent as well:
The annotation database associates each item with one or more item tags, labels or annotations. The item tags may be associated with attributes or properties of the item. For example, if an item refers to a person, e.g., “Lady Gaga,” the item tags associated with the person may include a profession tag, e.g., “musical artist,” “composer,” and/or “popular singer,” a nationality tag, e.g., “U.S.,” or a gender tag, “female.” If an item refers to a book, e.g., “rich dad poor dad,” the item tags associated with the book may include a category tag, e.g., “book,” an author tag, e.g., “Robert T. Kiyosaki,” or a publisher tag, e.g., “Plata Publishing.” If an item refers to a restaurant, e.g., “Gary Danko,” the item tags associated with the restaurant may include a category tag, “restaurant,” a street tag, e.g., “Point Street,” a city tag, e.g., “San Francisco,” or a cuisine type tag, e.g., “French.”
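To make the idea of item tags concrete, here is a small Python sketch that uses the patent's own examples as data. The dictionary layout, the function name, and the simple substring matching are assumptions made for illustration. The point is only that a query such as “what book was I talking about this morning?” can be answered from a category tag even when the earlier query never used the word “book.”

```python
# Hypothetical annotation database: item -> item tags, using the patent's examples.
ANNOTATIONS = {
    "lady gaga": {"profession": "musical artist", "nationality": "U.S.", "gender": "female"},
    "rich dad poor dad": {"category": "book", "author": "Robert T. Kiyosaki",
                          "publisher": "Plata Publishing"},
    "gary danko": {"category": "restaurant", "street": "Point Street",
                   "city": "San Francisco", "cuisine": "French"},
}


def items_with_tag(previous_queries, tag_name, tag_value):
    """Find annotated items mentioned in earlier queries that carry the given tag."""
    found = []
    for query in previous_queries:
        text = query.lower()
        for item, tags in ANNOTATIONS.items():
            # Assumption: a plain substring match stands in for real entity recognition.
            if item in text and tags.get(tag_name) == tag_value:
                found.append(item)
    return found


morning_queries = ["who wrote Rich Dad Poor Dad", "reviews of Gary Danko"]
# "What book was I talking about this morning?"
print(items_with_tag(morning_queries, "category", "book"))
```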
Added December 13, 2019. Another patent has been granted to Google about their Automated Assistant, which I wrote up in the post The Google Assistant and Context-Based Natural Language Processing. It’s interesting to see how the pieces behind the Google Assistant are fitting together.