In August of 2013, Google introduced a search feature they called In-Depth articles, writing about them in Google’s Inside Search blog, in a post titled Discover great in-depth articles on Google. This addition to the search results is a dedicated block that highlights one or a few articles that go deeper on a specific topic.
I was searching through Google’s patent applications recently and found one that covers these in-depth articles, and thought it was worth sharing a look behind the curtain at what Google says about these more detailed articles. And, my interest has been particularly piqued by them recently because they are appearing less frequently in the results.
For starters, Google’s Help section explains how Schema markup could help get these special results to show up for your pages, titled, Appearing in the “In-depth articles” feature. Were In-Depth articles too difficult for content creators to make?
An In-Depth Article Index for Queries
Whether an in-depth article may be shown in search results is dependent on:
- the search results being appropriate for it for a particular query
- those results being associated with a publishing entity that has been identified as a “stellar source,” and
- Google having determined that there is at least one or more in-depth article search results that should be provided in search results for that query.
This helps us better understand what qualifies an article as an “in-depth Article”, and what defines a “Stellar Source” is further defined in their patent. This also means that Google would have an index of In-Depth articles to show in search results for different queries. The patent discusses how it might provide scores for in-Depth Articles:
Identifying and Scoring In-Depth Articles
A web page might be seen as an in-depth article based upon it reaching an in-depth article score, which would include:
(1) Determining the article exceeds a threshold in-depth article score;
(2) The article being shown on websites from a number of seed websites based upon a similarity with other websites (based upon determinations like being created by publishers that are stellar sources);
(3) Information about the article includes Google having determined:
(a) An in-depth article score determined for the in-depth article,
(b) A uniform resource locator (URL) associated with the in-depth article,
(c) At least a portion of text of the in-depth article,
(d) An author of the in-depth article, and
(e) A publishing data of the in-depth article; and
(f) Each in-depth article score is based on one or more sub-scores including at least one of
- An article score,
- A commercial score,
- An evergreen score,
- A site pattern score, and
- An author score.
The Recently Published Google Patent Application
Surfacing In-Depth Articles in Search Results
Filing Date: June 26, 2015
Publication Number: 20150379140
Publication Date: December 31, 2015
Inventors: Anand Shukla, Pavan K. Desikan, Isabelle L. Stanton, Salvatore J. Candido
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing actions of determining that one or more in-depth article search results are to be provided in response to a query, obtaining a topicality score for each in-depth article of a plurality of in-depth articles, each topicality score indicating a degree of relevance of a respective in-depth article to the query, obtaining a document score for each in-depth article of the plurality of in-depth article, each document score is based on a respective topicality score and a respective in-depth article score, selecting one or more in-depth articles from the plurality of in-depth articles based on respective document scores, and providing the one or more in-depth article search results for display, each in-depth article search result representing an in-depth article of the one or more in-depth articles.
These “in-depth” articles had been intended to be insightful on a topic, or “a well-researched article that provokes deeper thought in readers” or possibly could be considered “read-to-learn” articles.
To get a taste of the intent behind such articles, we were fortunate in that the patent provided some examples of articles that might be shown as in-depth articles based upon the topic of “university admissions.” They were:
- Race and College Admissions, Facing a New Test by Justices
- Getting In
- The Myth of American Meritocracy
The patent tells us of them, with these words:
These example articles, provide insightful perspective on university admissions, and provoke deeper thought in some readers. Example articles that would not be described as in-depth articles with respect to university admissions include web pages of one or more universities describing respective admissions processes, and a web page published on an encyclopedic web site that generally describes university admissions processes.
In-depth articles might be taken from a set of seed websites, which are sites that are “associated with an entity that is known to publish quality content.”
These seed websites may be selected to be included in this set of seed websites based upon certain criteria, such as whether the entity underlying the website has achieved one or more accolades such as winning an award, such a Pulitzer Prize or a national magazine award, possibly within a certain period, like within the past ten years.
These sets of sites that might be targeted to contain in-depth articles may have articles from them scored to identify content that might qualify as in-depth articles. Those scores would be based upon a number of subscores, that might be based upon things such as:
- An article score
- A commercial score
- An evergreen score
- A site pattern score
- An author score
An Article Score is a degree to which the content of a URL could be considered long-form content. Long-form content is content that has a certain length, e.g., word count and/or number of pages, that is greater than a threshold length. The patent provides us with an example of what long-form content might be, telling us that “long-form content is often written in a narrative style with longer paragraphs.” An article score might be based upon:
- the word count of the content,
- the number of paragraphs in the content,
- the word count of respective paragraphs of the content, and/or
- the location of the respective paragraphs.
An article might sometimes reside on more than one URL, and the article score might be based upon the scores across the multiple URLs.
A Commercial Score might be provided by what the patent refers to as a commercial scoring engine, which would determine a score based upon “term, phrases and/or interaction elements provided in the content of a URL.” These would be “terms and/or phrases that indicate a commercial context, e.g., offer, sale, discount prices, and/or interaction elements, e.g., text boxes for entering credit card information, indicate commercial character.”
An Evergreen Score may indicate how much content is determined to be so-called evergreen content. Evergreen content may “include content that may be relevant regardless of age. That is, for example, the content is still relevant and interesting despite the publication date. In some examples, evergreen can be described as sustained interest over time.”
The patent tells us that a determination of the degree of evergreen content might be scored based upon the publication date of content and anchor text distribution to the article over time. Have other sites been linking to the article over some time, and does that show a sustained interest in the content over time? For example, an article that has been linked to 400 times over a long period is likely more evergreen than one that has been linked to 10 times in the same short period.
The patent discusses a predictive model based upon this evergreen score, the publishing entity who wrote it, and the length of the article and site URL patterns (like if it is in a body of articles on a site) associated with, to determine an overall IDA (In-Depth Article) score for an article. This overall scoring can be used to see if an article meets a certain threshold score and whether it qualifies as an in-depth article or not.
In-Depth Articles are determined to be from entities that are determined to be “stellar sources” based upon the content that they have written and whether or not what they have written has been shared or linked to “by other, reputable entities, and how often the content was linked to with positive anchor text, e.g., “great article,” “feature article,” “detailed piece.” The search engine may list one or more reputable entities in a table of reputable entities, with those stellar sources being represented as part of a site pattern: “e.g., www.qualitypublisher.com/reporting, www.qualitypublisher.com/magazines/articles, www.greatpublisher.com/stories.”
If these stellar sources, and the topics they write about, and the site patterns they cover appear in results for a query, articles by them might appear as in-depth articles in those search results (document scores might be calculated to determine if they should be shown)
Some queries may be considered queries that indicate that a searcher is potentially interested in learning more in-depth information about a particular topic. The patent tells us that sometimes a second search from a searcher may be considered a trigger query based upon whether it follows a certain first query:
In some examples, depending on the query that a user submitted, which resulted in an in-depth article being surfaced in search results, a query that the user is likely to submit next may be influenced by the user’s desire to learn more about a particular topic. In some examples, and in accordance with the present disclosure, a tag is provided, which a user can select to surface additional in-depth article search results for a suggested query that is represented by the tag. Stated more plainly, a tag represents a query that is suggested based on an in-depth article displayed in search results, and the query that the in-depth article is surfaced in response to. The suggested query can be submitted to surface additional in-depth articles. In some examples, the additional in-depth articles correspond to one or more topics of the in-depth article, for which the tag is provided. In this manner, users can use tags to explore other in-depth articles that may have some commonality to an originally surfaced in-depth article, e.g., one or more topics in common.
Sometimes a search is performed by a searcher that might be related to a specific entity or set of entities. The patent provides some information about how different entities might be related, and how an in-depth article might be triggered based upon that relationship. The patent tells us that an entity graph or structured data might be used to identify entities that are associated with a particular query. In-Depth articles might be shown in search results that have to do with topics involving those related topics. The patent doesn’t provide a lot of details on this related entity aspect of In-Depth Articles, except to tell us that associations between different entities may be what triggers the presentation of in-depth articles about specific topics
A look around Google search results for In-Depth Articles isn’t revealing many of them. (I tried to find some non-client examples so that I could take some screenshots and show them off, but it is much harder to find them these days). Additionally, when they do appear, they seem to no-longer be denoted by the ‘In-depth articles’ heading or have any of the visual accompaniments, but rather now just have thin gray lines above/below them. People may not have been clicking upon them in search results with the heading, and Google may have decided to change the look and show them less frequently.
The idea of having an article that is an “In-Depth” article for a topic is a compelling one, especially when Google was displaying those in ways that stood out, like in the following example from the Google Help page about them that shows how they had a headline of “In-Depth Articles” while in search results, and were displayed with a thumbnail image appearing in front of them, and a logo and publisher name appearing prominently next to them.
The language in the patent about some queries triggering in-depth articles reminded me of Google patents about how some queries triggered rich snippets (like including a place name and the word “weather” would show a rich search result.)
I’m not sure that I can say that I had ever seen an In-Depth Article that focused upon a related entity, but I think that is an interesting idea which I would like to see in action.
Are you seeing any In-Depth Articles from Google? The examples the patent gave of detailed and rich articles about University Admissions were interesting articles. I’ve been keeping an eye open for a patent related to Google’s “in the news” search results that Google sometimes shows off in search results, which are often blog posts about new and newsworthy topics that have a headline of “In the News” and often have an image that accompanies them, like this one about David Bowie:
I’m curious about how Google might refer to the sources those are from, and if they call them something like “Stellar Sources”, as they have in this patent for publishers for In-Depth Articles.
The way that Google used anchor text distributions over time to determine an evergreen score for in-Depth articles was interesting. A few people in the SEO Industry have written about evergreen content and have suggested removing post dates from blog posts, which wouldn’t alone show the same kind of continual interest in an article over time that Google is looking for.
I’m wondering if a continued stream of social shares at places like Twitter or Facebook or Google+ might be another way of telling about how evergreen content might be. A process like this in-Depth article approach might evolve, and we may see these start reappearing – they seem like an interesting idea.