Expanding Queries by Rewriting Keywords
Google’s patent is from March 9, 2021. It covers rewriting keywords to expand queries and return a wider range of search results in response to what a searcher has searched for. The patent appears to cover paid search results, even though the keyword rewriting approach has been in organic SEO for a long time.
Rewriting keywords is how both RankBrain and Hummingbird work. Neither algorithm changes the pages returned in search results. Both change the query used to find those pages by rewriting keywords.
Google also is rewriting keywords by adding substitutions of synonyms to queries. I described this in How Google May Substitute Query Terms with Co-Occurrence.
Purpose Behind Searches
It is often desirable to pair documents with a requested document on the Web, such as a web page.
A document viewer’s experience may get enhanced by content that is of interest to the viewer.
For example, a requested web page may get displayed along with additional content items.
The content can get selected based on various criteria, often with the end goal of soliciting a searcher reaction, such as exploring the content via a click-through.
This patent works with ways to evaluate information in a network.
A data processing system can get or receive a content placement criterion, such as a keyword, associated with content items and can determine a quality metric of the keyword.
On the Google AdWords API page, we see language such as “content placement criterion” used to refer to keywords. This is another sign that this patent focuses on the rewriting of keywords in paid search.
Search Based on Keywords and The Quality of those Keywords
The search engine can identify a candidate keyword. It can rewrite keywords associated with the content item to include the keyword and a rewritten keyword. That rewrite can be based, at least in part, on a quality metric of the keyword.
Webpages get indexed by Google in an inverted index. When you search, Google looks for pages that contain the keywords you have searched for. The search engine may combine query terms in a Boolean way, with “or” or “and” between some of those keywords, or use some rewritten keywords. This means a wider range of pages can rank for different queries.
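To make the idea of Boolean expansion concrete, here is a minimal Python sketch of an inverted index in which a rewritten keyword is OR-ed with the original term while the separate query terms are AND-ed together. The function names, the sample pages, and the rewrite table are illustrative assumptions, not something described in the patent or known about Google's systems.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each lowercase term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def search(index, terms, rewrites=None):
    """AND across query terms; OR between a term and its rewritten variants."""
    rewrites = rewrites or {}
    result = None
    for term in terms:
        matches = set()
        for variant in [term] + rewrites.get(term, []):
            matches |= index.get(variant, set())
        result = matches if result is None else result & matches
    return result or set()


# Hypothetical pages and a hypothetical rewrite of "vacation" to "holiday".
pages = {
    "p1": "cheap beach vacation deals",
    "p2": "cheap beach holiday deals",
    "p3": "mountain holiday cabins",
}
index = build_inverted_index(pages)
plain = search(index, ["beach", "vacation"])
expanded = search(index, ["beach", "vacation"], {"vacation": ["holiday"]})
```

With the rewrite in place, the page that says "holiday" instead of "vacation" becomes eligible, which is the "wider range of pages" effect described above.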
The system can expand keywords based in part on a throttling parameter.
This data processing system can identify a correlation between a document and the keywords to identify appropriate content items for the document.
These content items may include text, images, and media elements such as video or interactive features like a game.
Content Items in Documents
A content provider may provide content items to get delivered with documents from content publishers.
The placement system may determine what content items to deliver based on keywords.
Successful keywords may place a content item and get an expansion to include more keywords derived from a successful placement criterion.
At least one aspect gets directed to evaluating information in a computer network environment by obtaining a keyword associated with a content item. It will then determine a quality metric of the keyword. It could also identify a candidate keyword and expand keywords associated with the content item to include the keyword and the candidate keyword based on an evaluation of the quality metric of the keyword. The quality metric can get based on a click-through rate. The method may further include determining the click-through rate of the keyword corresponding to the content item.
A Quality Metric of a Candidate Keyword
At least one aspect gets directed to a system of providing information via a computer network. The system may include a data processing system with at least one content placement module and a quality metric module. The data processing system may receive a keyword associated with a content item. It may then determine a quality metric of the keyword. A candidate keyword could expand the keywords associated with the content item to include the keyword and the candidate keyword, based, at least in part, on the quality metric of the keyword.
At least one aspect works with a computer-readable storage medium with instructions to provide information via a computer network. The instructions would include instructions to receive a keyword associated with a content item. They would determine a quality metric of the keyword. They would also identify a candidate keyword.
Next, they would expand the keywords associated with the content item to include the keyword and the candidate keyword, based on an evaluation of the quality metric of the keyword. The instructions may include instructions to determine a throttling parameter of the content item and expand the keywords to include the candidate keyword based on evaluating the throttling parameter.
Expansion of high performing keywords
Inventors: Gaofeng Zhao, Yingwei Cui, Hui Tan, Bahman Rabii, and Wei Chai
Assignee: Google LLC
US Patent: 10,943,259
Granted: March 9, 2021
Filed: April 19, 2019
Systems and methods of evaluating information in a computer network environment get provided. A data processing system can obtain or receive a content placement criterion, such as a keyword, associated with a content item and determine the keyword’s quality metric. The data processing system can identify a candidate keyword and expand keywords associated with the content item to include the keyword and the candidate keyword, based at least in part on an evaluation of the quality metric of the keyword. The data processing system can expand keywords based in part on a throttling parameter. The data processing system can identify a correlation between a document and the keywords to identify appropriate content items for the document.
Rewritten Keywords to Include High-Performing Keywords
A search returns both search results and paid ads based on the query. I looked up the inventors of this patent on LinkedIn, and one was a long-time Vice President at Google. A couple of others now work at Waymo, the self-driving car company that grew out of Google, but before that worked in Ads at Google. Some of the language behind this patent sounds like it has something to do with ads, even though we have seen rewriting keywords at Google since 2003.
When I search for “High Performing Keywords” in Google, I see references to keywords that get clicked on a lot, usually in the context of advertisements. One of the quality metrics focused on in this patent involves CTR. It’s not odd that the keyword rewriting aspects of this patent would apply to search results. It’s also not unusual that processes like synonym substitution and query completion, as applied to SERPs, could also be in ads.
Below are ways of expanding keywords to include high-performing keywords. While this patent tells us about the keyword rewriting approach used in search, it also tells us that keywords can get rewritten for advertisements.
High Performing Keywords and Content in Patents Mean Paid Search
Also, when Google patents refer to “content,” they usually don’t mean the content that appears on unpaid search pages. It’s usually a reference to “content providers” or “content sponsors,” as it is in this patent. This passage from the patent shows that usage:
A content provider such as a content sponsor may provide content items with documents from primary content publishers. When a web page or other document gets requested, a content placement system can determine content items to present with the document. The placement system may determine what content item or items to deliver based on keywords. The placement system may identify a correlation between the document and a placement criterion associated with a content item and select that content item for delivery.
So this patent targets paid advertisements returned for a query. It tells us that Google may rewrite keywords for the paid results as well as the SERPs. The patent and the backgrounds of the inventors make that obvious. Here is more language from the patent about “content.”
One example of a placement criterion is a keyword. A content item may include text, images, and media elements such as an image, video, or interactive feature (e.g., a game). Content items may provide information enhancing other content within a document, e.g., primary content.
Content items may get related to the primary content (e.g., behind-the-scenes media), include material related to the source of the primary content (e.g., highlights of other content from the source), include material that may be of interest to the searcher based on searcher-related factors (e.g., a relevant weather report), provide offers for the searcher to participate in a transaction, or provide media elements such as a video player or interactive game.
A placement criterion may be successful and lead to the placement of a content item with a web page where the web page viewer is likely to respond to the content item. The keywords can also expand, e.g., from criteria provided by the content provider to the content placement system to include more keywords.
The additional keywords can get derived from the successful placement criterion. The content item may be part of a content group where the keywords get affiliated with the group as a whole. In this example, successful keywords for a content item from the content group may expand to include more keywords for more content items or other content in the group.
Providing Content to a Searcher Device
The network can include computer networks such as the Internet, local, metro, or wide area networks, intranets, and other communication networks such as mobile telephone networks. The network can access web pages that can show on at least one searcher device, such as laptops, desktops, tablets, personal digital assistants, smartphones, or portable computers. For example, via the network, a searcher of the searcher device can access web pages provided by a content publisher.
The system includes at least one data processing system that, for example, can include a processor or other logic device to communicate via the network with at least one content publisher and at least one supplemental content provider. The data processing system can include a content placement system configured to test and include candidate keywords as keywords used to place content items with web pages or other documents via the network. The data processing system can also include at least one content placement module, at least one quality metric module, and at least one data repository or database. The data processing system can include or communicate with a content selection server, a content host server, a content placement server, and other data processing systems.
The data processing system includes an interface configured to receive a request via the network to identify content items for delivery to a web page. The data processing system may receive the request in real-time. This could be after the searcher device requests access to a web page but before the web page is fully rendered on the searcher device. The activity of searchers on the network, requesting web pages, can be anonymous. Individual searchers cannot get identified from the maintained searcher activity. Furthermore, searcher activity information can get collected on an opt-in or opt-out basis.
Searchers can control the collection of their activity information. The searcher can get represented by identifiers associated with the searcher device, for example, using a cookie, without regard to the actual identity of the person accessing the searcher device. Identifiers may get selected containing no personally identifiable information about the searcher or the searcher device. For example, a random number may get used. A searcher may affirmatively opt to provide personally identifying information such as a name, nickname, e-mail address, or other identifying feature to the data processing system. A searcher may benefit without providing any such information to the data processing system.
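As a small illustration of an identifier that carries no personal information, here is a Python sketch that generates a random value suitable for storing in a cookie. The function name and the 128-bit length are assumptions made for this example; the patent only says that a random number may get used.

```python
import secrets


def new_visitor_id():
    """Return a random 128-bit identifier as 32 hex characters.

    The value is drawn from the operating system's random source, so it
    encodes nothing about the person or the device.
    """
    return secrets.token_hex(16)


cookie_value = new_visitor_id()
```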
An Illustrative Computer System
The content placement module or the quality metric module can each be part of at least one computer system. The example computer system includes processors in communication, via a bus, with network interfaces (in communication with the network), I/O interfaces (for interacting with the searcher), and internal memory. In some implementations, a processor incorporates or is directly connected to additional cache memory.
The searcher device includes a computer system. For example, a searcher can interact with an input device, e.g., a keyboard, mouse, or touch screen, to request a web page to get delivered over the network, received at the interface, and output via an output device. The request can get processed by the data processing system, which may include a computer system, for example, to identify content items from the content provider for delivery with content from the content publisher based on keywords.
When a web page gets requested by a searcher device, a content placement system, e.g., the data processing system, determines content items to present with the requested web page. The placement system determines what content items to deliver based on keywords.
A Placement Criterion Is a Keyword in PPC
A placement criterion, such as a keyword, can include more than one term, component, number, or word. The data processing system uses at least one keyword to match the content item with a web page (or other documents). One placement criterion can include a phrase having more than one word, e.g., “BrandName Product.”
In an example usage, the data processing system receives the notification of a request for access to a web page. The data processing system can compare the keywords with features of the web page, including the title, content, or metadata, and determine if the content item associated with the criteria should get provided for display with the web page. The keywords may be the words “BrandName Product,” and a placement server of the data processing system gets configured to place an associated content item on a page that includes both “BrandName” and “Product.”
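The “BrandName Product” example can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the page is a dictionary with hypothetical `title`, `content`, and `metadata` fields, and a keyword matches when every one of its terms appears somewhere in those fields.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_matches_page(keyword, page):
    """True when every term of the keyword appears in the page's features."""
    page_tokens = set()
    for field in ("title", "content", "metadata"):
        page_tokens |= tokenize(page.get(field, ""))
    return tokenize(keyword) <= page_tokens


# Hypothetical pages used only for this example.
review_page = {
    "title": "Review: the new BrandName lineup",
    "content": "Each product ships in May.",
    "metadata": "reviews",
}
hours_page = {"title": "BrandName store hours", "content": "Open daily."}

matches_review = keyword_matches_page("BrandName Product", review_page)
matches_hours = keyword_matches_page("BrandName Product", hours_page)
```

The second page mentions “BrandName” but not “Product,” so the multi-term keyword does not match it.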
The placement system may determine which content item to deliver with a particular document or web page based on many keywords associated with many content items. For example, the document can get parsed into a set of tokens. The tokens may get compared with the keywords to identify associated content items. The keywords may get expanded to include more keywords. The placement system may determine the content item or items to deliver based on the original keywords or the expanded additional keywords.
The set of tokens parsed from the document may get filtered to determine a sub-set of useful tokens for selecting content items. The set of tokens may get expanded, like that described for keywords, to include more tokens related to the tokens. The data processing system may use the set of tokens, or the set of expanded tokens, in determining the content item or items to deliver with the document.
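Here is a rough Python sketch of that token pipeline: parse the document, filter out unhelpful tokens, optionally expand the remaining tokens, and compare them with each content item's keywords. The stopword list, the related-terms table, and the ad identifiers are all made up for illustration.

```python
# Assumed stopword list; a real system would use something far richer.
STOPWORDS = {"the", "a", "of", "and", "for", "in", "on"}


def useful_tokens(document):
    """Parse a document into tokens and drop punctuation and stopwords."""
    tokens = {word.strip(".,!?") for word in document.lower().split()}
    return tokens - STOPWORDS - {""}


def expand_tokens(tokens, related):
    """Add related tokens (for example, synonyms) to the token set."""
    expanded = set(tokens)
    for token in tokens:
        expanded |= set(related.get(token, ()))
    return expanded


def select_content_items(tokens, keywords_by_item):
    """Return ids of items with at least one keyword fully covered by tokens."""
    selected = []
    for item_id, keywords in keywords_by_item.items():
        if any(set(k.lower().split()) <= tokens for k in keywords):
            selected.append(item_id)
    return selected


keywords_by_item = {"resort_ad": ["beach vacation"], "ski_ad": ["ski pass"]}
tokens = useful_tokens("Planning a holiday on the beach.")
without_expansion = select_content_items(tokens, keywords_by_item)
with_expansion = select_content_items(
    expand_tokens(tokens, {"holiday": ["vacation"]}), keywords_by_item
)
```

Expanding the tokens and expanding the keywords are two routes to the same result: a document about a “holiday” can end up paired with a content item whose keyword says “vacation.”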
Relationships Between Content Items and Rewritten Keywords
Content can get associated with many criteria. For example, content gets associated with a first criterion and a second criterion. And content may get grouped. For example, content may get grouped with similar or related content into a content group. The content group may include some content items and exclude other content, e.g., unrelated content. The group may get associated with at least one placement criterion. The constituent content of the group may be individually associated with respective criteria, or, in some implementations, the content group may get associated with a group of keywords.
The keywords can get expanded by the data processing system to include the criterion and at least one expanded criterion. In this example, the expanded criterion may get associated with the original criterion. The expanded criterion gets associated (e.g., as represented by identifier) with the content. In this example, the keywords associated with the content get expanded to include both the criterion (e.g., received at the data processing system from the supplemental content provider) and the expanded criterion (e.g., determined by the data processing system).
The expanded criterion can get added (as indicated by identifier) to the keywords for the content or the content group. In some implementations, a criterion gets expanded to many criteria. For example, a criterion can expand to a first expanded criterion and a second expanded criterion. In some implementations, many keywords get expanded into a single expanded criterion.
Keywords and Content May Be Evaluated to Determine If Search Results Should Contain Rewritten Keywords
For example, a first placement criterion and a second placement criterion can get evaluated by the data processing system to generate an expanded criterion. These are examples of relationships between content, content groups, and keywords. Content groups, keywords, and criteria expansions can take other forms and have other relationships.
Original keywords may expand to include more criteria when the original keywords get determined to be of enough quality about their associated content. For example, the data processing system can determine that a placement criterion is successful if it leads to the placement of content items with a web page where the web page viewer is likely to respond to one of the content items.
The data processing system can predict or estimate the value of a placement criterion by determining at least one quality metric of the placement criterion. For example, the quality metric module can determine a placement criterion’s click-through rate (or another performance trend). Keywords already associated with a content item get expanded to include a candidate keyword when the quality metric for the keywords already associated with the content item exceeds a threshold, such as a sufficiently high click-through rate (CTR).
A Quality Metric Based On Click Throughs
In one example of using a quality metric based on a click-through rate (CTR), the CTR can measure the historical click-through rate of content items placed on web pages due to a correlation between a keyword or other placement criterion and the content of those web pages. The content items used in generating a CTR may get selected from a content group rather than all available content items.
For example, the quality metric can get based on the historical performance of a placement criterion (e.g., a keyword) relative to its corresponding content item or content group. That contrasts with a more generic quality metric of keyword performance relative to other content, such as different content items or unrelated content that is not part of the content group.
The content group may get based on similarity of product, content, sponsor, genre, or any other classification scheme. A content group may get formed through an automated process, by a specialist, or even by specification from the content item’s source. A quality metric based on CTR may also consider additional analysis based on statistical models to infer searcher behavior. Quality metrics based on CTR generally estimate whether a keyword is useful, that is, whether a searcher is likely to click on a content item placed on a web page due to a correlation with that keyword. Other quality metrics, such as metrics based on click-through volume, can also predict the value of candidate keywords.
The data processing system obtains at least one keyword (e.g., placement criterion). The data processing system can get the keyword from the supplemental content provider or the database. The keyword can include one or more terms or other placement criteria associated with a content item, content group, or other content. The content placement module can use the keywords to match the associated content item with a web page, and the content item can get provided for display with the web page at the searcher device.
A Quality Metric of Keywords
A data processing system determines the quality metric of the keyword.
The quality metric module can determine the keyword’s click-through rate relative to its associated content.
The same keyword may have different click-through rates (or other quality metrics) for various content items.
For example, a keyword may be highly effective for placing one content item, but the same keyword can be ineffective when used to place a different content item.
The data processing system determines a quality metric for the keyword relative to the corresponding content.
The data processing system can get, e.g., from the supplemental content provider, the keyword “tropical beach vacations” associated with a particular content item (or group of content items) for an exclusive Caribbean resort. In this example, the data processing system can determine the click-through rate of the keyword “tropical beach vacations” relative to the particular content item for the Caribbean resort.
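One way to picture a click-through rate measured “relative to” a content item is to key the counts on the (keyword, content item) pair rather than on the keyword alone. The class below is a minimal Python sketch of that bookkeeping; the class name, the item identifiers, and the sample counts are assumptions for illustration only.

```python
from collections import defaultdict


class CtrTracker:
    """Track impressions and clicks per (keyword, content item) pair."""

    def __init__(self):
        self.impressions = defaultdict(int)
        self.clicks = defaultdict(int)

    def record(self, keyword, item_id, clicked):
        """Record one placement and whether the searcher clicked on it."""
        key = (keyword, item_id)
        self.impressions[key] += 1
        if clicked:
            self.clicks[key] += 1

    def ctr(self, keyword, item_id):
        """Clicks divided by impressions, or 0.0 when nothing was shown."""
        key = (keyword, item_id)
        shown = self.impressions.get(key, 0)
        return self.clicks.get(key, 0) / shown if shown else 0.0


tracker = CtrTracker()
for clicked in (True, False, False, False):
    tracker.record("tropical beach vacations", "caribbean_resort", clicked)
for clicked in (False, False):
    tracker.record("tropical beach vacations", "ski_lodge", clicked)

resort_ctr = tracker.ctr("tropical beach vacations", "caribbean_resort")
lodge_ctr = tracker.ctr("tropical beach vacations", "ski_lodge")
```

The same keyword ends up with two different rates, one per content item, which is the distinction the patent draws.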
Candidate Keyword Criteria
The data processing system can also identify at least one candidate keyword (e.g., criteria expansions). The candidate keywords can get derived from the obtained keywords that get associated with the content item. For example, from the placement criterion “tropical beach vacations,” the data processing system can identify the candidate keywords “beach vacations,” “tropical beach,” “beach holiday,” “island getaway,” or “honeymoon destination.” The candidate keywords generally include keywords, terms, or other criteria that may assist with the placement of the content item associated with the (e.g., original) keywords. The candidate keywords (e.g., “beach vacations”) can be terms of the original keywords (e.g., “tropical beach vacations”) or different terms having a degree of semantic or subject matter similarity with the original keywords (e.g., “beach holiday” since “holiday” is sometimes synonymous with “vacation”).
A data processing system expands criteria associated with the content to include the candidate keyword. For example, the data processing system can evaluate the quality metric of the placement criterion associated with the content item. It can associate the candidate keyword with the content item when the quality metric of a placement criterion associated with the content item satisfies a threshold. For example, the candidate placement criterion can get added to keywords of the content item when the click-through rate of the placement criterion relative to its associated content item is above a threshold level.
Click Through Above a Threshold Percentage
Concerning the above example, when the click-through rate of the keyword “tropical beach vacations” relative to the particular Caribbean resort content item is above a threshold percentage, the data processing system can expand keywords for the Caribbean resort content item to include at least one candidate keyword such as “beach holiday” or “island getaway” in addition to the criterion “tropical beach vacations.” In this example, the content placement module can use any of these criteria to place the Caribbean resort content item on a web page via the network for display at the searcher device.
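The threshold check in the Caribbean resort example can be written as a short Python function. This is a sketch, not the patent's implementation, and the 5% threshold and the CTR values are hypothetical numbers chosen for the example.

```python
def expand_keywords(original, candidates, ctr, threshold):
    """Return the keyword list for a content item after the threshold check.

    Candidates get added only when the original keyword's CTR, measured
    against this content item, is above the threshold.
    """
    if ctr > threshold:
        return [original] + [c for c in candidates if c != original]
    return [original]


candidates = ["beach holiday", "island getaway"]
high = expand_keywords("tropical beach vacations", candidates, ctr=0.08, threshold=0.05)
low = expand_keywords("tropical beach vacations", candidates, ctr=0.01, threshold=0.05)
```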
The Budget For Placement of a Candidate Keyword
In some implementations, the data processing system determines a throttling parameter of the content. It expands the keywords of the content item based at least in part on the throttling parameter. The throttling parameter can get based on a monetary budget for content item placement. For example, an entity associated with the data processing system or the content publisher may charge the supplemental content provider a fee to place content on web pages. Excessive content placement may prematurely exhaust the budget.
The data processing system can determine that the budget for placement of a content item is insufficient or is being depleted at a sufficiently high rate that expanding the keywords to include additional criteria is not warranted. In this example, the data processing system can exclude a candidate keyword from association with content in order, for example, to conserve the budget dedicated to the placement of that content item.
The candidate placement criterion can get excluded despite the quality metric of the (e.g., original) keywords having a high click-through rate or otherwise satisfying a quality metric threshold. In some examples, the data processing system determines that a budget (or other throttling parameters) is enough and proceeds to associate candidate keywords with the content item.
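The interaction between the quality metric and the throttling parameter can be sketched as two checks that both have to pass. The field names, the per-hour depletion rate, and all of the numbers below are assumptions for illustration; the patent does not specify how the budget or its depletion rate gets measured.

```python
from dataclasses import dataclass


@dataclass
class ThrottlingParameter:
    """Hypothetical budget-based throttle for one content item."""

    budget_remaining: float
    spend_per_hour: float
    min_budget: float
    max_spend_per_hour: float

    def allows_expansion(self):
        """Expansion is allowed only if budget is sufficient and not draining too fast."""
        return (
            self.budget_remaining >= self.min_budget
            and self.spend_per_hour <= self.max_spend_per_hour
        )


def should_add_candidate(ctr, ctr_threshold, throttle):
    """A candidate gets added only when both the CTR and the throttle agree."""
    return ctr > ctr_threshold and throttle.allows_expansion()


healthy = ThrottlingParameter(500.0, 10.0, min_budget=100.0, max_spend_per_hour=50.0)
nearly_spent = ThrottlingParameter(20.0, 10.0, min_budget=100.0, max_spend_per_hour=50.0)
draining = ThrottlingParameter(500.0, 80.0, min_budget=100.0, max_spend_per_hour=50.0)

decisions = [
    should_add_candidate(0.08, 0.05, healthy),
    should_add_candidate(0.08, 0.05, nearly_spent),
    should_add_candidate(0.08, 0.05, draining),
]
```

In the second and third cases the candidate is excluded even though the original keyword's CTR clears the threshold, matching the behavior described above.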
An Example Method for Keywords
The method obtains at least one keyword associated with the content and determines a quality metric of the keyword. In some implementations, the method identifies a candidate keyword, for example, by expanding the obtained criterion. In some implementations, the method expands the keywords associated with the content, including the keyword and the candidate keyword.
It may expand the keywords associated with the content to include the candidate keyword when the determined quality metric of the keyword indicates a valuable or successful placement criterion. In some implementations, additional parameters get considered. For example, in some implementations, a throttling parameter can limit criteria expansion.
Determining a Quality Metric of the Keyword
Accordingly, in some implementations, determining a quality metric of the keyword includes:
- An analysis of the criterion
- The objectives of the content item’s source
A quality metric could be a click-through rate (CTR), as discussed above. Other quality metrics, such as a click-through volume, can also predict the value of candidate keywords. A quality metric may be a composite, e.g., a CTR adjusted by additional metric parameters. For example, the additional metric parameters may be a weight reflecting the frequency a candidate placement criterion gets used in placing content items or the frequency a candidate placement criterion occurs in documents with which content items may get delivered.
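A composite metric of this kind could look like the Python sketch below. The patent does not say how the adjustment works, so multiplying the CTR by the weights is purely an assumption here, as are the parameter names and the sample values.

```python
def composite_quality(clicks, impressions, usage_weight=1.0, document_weight=1.0):
    """CTR adjusted by two optional weights (multiplication is an assumption).

    usage_weight: reflects how often the criterion gets used to place items.
    document_weight: reflects how often the criterion occurs in documents.
    """
    ctr = clicks / impressions if impressions else 0.0
    return ctr * usage_weight * document_weight


plain_ctr = composite_quality(50, 1000)
adjusted = composite_quality(50, 1000, usage_weight=0.5)
```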
When the original keywords are sufficiently effective, the keywords specified by the content provider may get expanded. A criteria expansion system can generate expanded criteria from the original keywords specified by the content provider. For example, an original criterion for a content item highlighting a BrandName product may be the keyword “BrandName Product.” From the keywords, the expanded criterion “BrandName” can get derived. This expanded criterion can compare with web page content to place the content item corresponding to the “BrandName Product” criterion.
How Rewritten Keywords Get Found
Expansion of an original placement criterion may get automated through:
- Phrase parsing. This is using a subset of words found in the criterion or keyword
- Dictionary substitution. Using synonyms
- Brand correlation substitution. This would be using a table of model names for a BrandName product
- Semantic similarity
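The first three techniques in the list are simple enough to sketch in Python. The synonym dictionary, the brand-to-model table, and the choice to keep only contiguous sub-phrases are assumptions made for this example. Semantic similarity is left out because it would need a trained model that the patent does not describe.

```python
def phrase_parsing(keyword):
    """Contiguous sub-phrases of the keyword, longest first."""
    words = keyword.split()
    phrases = []
    for size in range(len(words) - 1, 0, -1):
        for start in range(len(words) - size + 1):
            phrases.append(" ".join(words[start:start + size]))
    return phrases


def dictionary_substitution(keyword, synonyms):
    """Swap one word at a time for each of its listed synonyms."""
    words = keyword.split()
    variants = []
    for i, word in enumerate(words):
        for alternative in synonyms.get(word, []):
            variants.append(" ".join(words[:i] + [alternative] + words[i + 1:]))
    return variants


def brand_substitution(keyword, models_by_brand):
    """Pair a brand found in the keyword with each of its model names."""
    variants = []
    for brand, models in models_by_brand.items():
        if brand in keyword.split():
            variants.extend(f"{brand} {model}" for model in models)
    return variants


parsed = phrase_parsing("tropical beach vacations")
substituted = dictionary_substitution("beach vacations", {"vacations": ["holidays"]})
branded = brand_substitution("BrandName Product", {"BrandName": ["ModelX", "ModelY"]})
```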
Manual Rewriting Keywords
Criterion expansion can be manual. For example, a specialist may have additional placement criterion suggestions not supplied by the content item’s source. An expanded criterion may, but need not, include elements from the original criterion, such as terms of a multi-term keyword. Expanded criteria can be in a repository (e.g., a database) and retrieved to assist with the placement of content associated with the original placement criterion.
Varying or Restricting Keywords
It could be useful to vary or restrict the use of expanded keywords. For example, it may be better not to expand an ineffective keyword, where the determined quality metric of the original keyword is below some threshold value. If the CTR of the original placement criterion is low, then the candidate placement criterion might not get added to supplement the original criterion.
A low CTR suggests the original keyword is not very good. In that case, it is generally not beneficial to add a similarly ineffective expanded keyword. If the CTR is high, then the candidate placement criterion can supplement the original keyword for content item placement. The expanded placement criterion can complement the original placement criterion when the CTR of the original placement criterion for its corresponding content is sufficiently high.
High and Low CTR Can Get Determined by Comparison with a Threshold Value
The patent tells us that expanded keywords may be overly successful. A content item provider (e.g., a sponsor) may wish to control the number of presentations for a content item using a throttling parameter, for example to maintain a preferred click-through rate. A sponsor may want to limit the CTR for many reasons. These can include a sponsorship budget. Or they could involve load balancing for a landing server or other limitations. The content item may include an offer of a special deal for a fixed number of participants. In these examples, the data processing system can determine if expanded keywords should get added to the original keywords.
For example, a sponsor can provide a maximum budget to spend for content item placement. The data processing system deducts a certain amount from that budget each time the corresponding content item gets placed on a web page. Keyword traffic may get restricted by using a throttling parameter to block candidate keywords, either to conserve the budget or when the budget gets exhausted. For example, if a content item runs out of budget, expansion can get reduced or otherwise restricted. This generally prevents the content item from getting placed on a web page based on anything other than a match with the original criterion provided by the sponsor. The expanded criteria can complement the original criteria when the budget for content item placement is sufficiently high, meaning above a threshold value.
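The budget deduction described here can be sketched as a small Python class. The class name, the flat cost per placement, and the “expansion floor” below which only the sponsor's original keywords stay active are all hypothetical. The patent only says that expansion depends on the budget being above a threshold value.

```python
class PlacementBudget:
    """Hypothetical sponsor budget that shrinks with every placement."""

    def __init__(self, maximum, cost_per_placement, expansion_floor):
        self.remaining = maximum
        self.cost_per_placement = cost_per_placement
        self.expansion_floor = expansion_floor

    def record_placement(self):
        """Deduct the cost of one placement, never going below zero."""
        self.remaining = max(0.0, self.remaining - self.cost_per_placement)

    def active_keywords(self, original, expanded):
        """Use expanded keywords only while the budget is above the floor."""
        if self.remaining > self.expansion_floor:
            return original + expanded
        return list(original)


budget = PlacementBudget(maximum=10.0, cost_per_placement=2.0, expansion_floor=5.0)
before = budget.active_keywords(["BrandName Product"], ["BrandName"])
for _ in range(3):
    budget.record_placement()
after = budget.active_keywords(["BrandName Product"], ["BrandName"])
```

After three placements the remaining budget drops below the floor, so only the original keyword is left to trigger placements.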
Paid Search Becoming More Like Organic Search
We have seen paid search results that include sitelinks. I’ve written about what looks like a merger of organic and local results in a patent on both, which I called Location Extensions Augmented Advertisements. This patent tells us about how query terms might be rewritten to show searchers results that are very close in meaning to what they originally searched for.
Though I do not often use paid search, it is a service that Go Fish Digital offers. It is very positive to see that it is evolving along with organic search. This means that searchers are more likely to find what they are looking for when searching regardless of whether they click on an ad or an organic search result.