Google Images with Structured Data Actions and Annotations

Posted June 16, 2017

Google Lens

Last year, I wrote about a new patent from Google in New Visual Search Photo Features from Google. A rumor about new capabilities of Android phone apps accompanied a patent describing an object-outline recognition search feature built into such apps. So when Google announced something it was calling Google Lens at its I/O 2017 Developer Conference, it felt a little like déjà vu. That rumored feature hasn't been released yet, though, and Google Lens isn't quite the same thing. It's a little different, and its focus on structured data actions is interesting.

I’ve been keeping an eye out for patents from Google, but I missed this one, which the folks at Patently Apple reported on at the start of the month:

The Patent behind Google’s Augmented Reality Camera Feature called ‘Google Lens’ Pops up at the U.S. Patent Office

After reading that, I thought the patent was worth a closer look. It focuses upon taking actions based on content found in images.

The patent is:

Smart Camera User Interface
Inventors: Teresa Ko, Adam Hartwig, Mikkel Crone Koser, Alexei Masterov, Andrews-Junior Kimbembe, Matthew J. Bridges, Paul Chang, David Petrou, and Adam Berenzweig
US Patent Application: 20170155850
Published: June 1, 2017
Filed: February 9, 2017


Implementations of the present disclosure include actions of receiving image data of an image capturing a scene, receiving data describing one or more entities determined from the scene, determining one or more actions based on the one or more entities, each action being provided at least partly based on search results from searching the one or more entities, and providing instructions to display an action interface comprising one or more action elements, each action element being selectable to induce execution of a respective action, the action interface being displayed in a viewfinder.
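The claim describes a pipeline: image data comes in, entities are recognized in the scene, actions are derived at least partly from search results for those entities, and action elements are displayed in the camera viewfinder. Here is a minimal Python sketch of that flow; every name, type, and the stand-in engines are hypothetical, not from the patent:

```python
from dataclasses import dataclass

@dataclass
class ActionElement:
    label: str   # text shown on the action element in the viewfinder
    action: str  # action executed when the element is selected

# Stand-in engines so the sketch runs; a real system would call an
# image-recognition service and a search backend.
def recognize_entities(image_data: bytes) -> list[str]:
    return ["Best Band Ever"]

def search(entity: str) -> list[str]:
    return [f"{entity} tour dates", f"{entity} albums"]

def build_action_interface(image_data: bytes) -> list[ActionElement]:
    """Follow the claim's steps: image data -> entities -> actions -> interface."""
    entities = recognize_entities(image_data)  # entities determined from the scene
    elements = []
    for entity in entities:
        # each action is based at least partly on search results for the entity
        for result in search(entity):
            elements.append(ActionElement(label=result, action=f"open:{result}"))
    return elements  # these would be rendered in the viewfinder

interface = build_action_interface(b"\x89fake-image-bytes")
```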

Actions associated with Images

Example actions that people can take, based upon finding content in images may include:

(1) Sharing content such as images and video,
(2) Purchasing one or more items,
(3) Downloading content such as music, video, or images,
(4) An adding event action, such as adding an event to a calendar, and
(5) An add to album action that can be executed to add content, e.g., images, to an album, e.g., photo album.
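The five example actions above could be modeled as a simple enumeration; this is just one way to organize them, and the names are mine rather than the patent's:

```python
from enum import Enum, auto

class Action(Enum):
    """The five example actions from the patent; identifier names are hypothetical."""
    SHARE_CONTENT = auto()     # share content such as images and video
    PURCHASE_ITEM = auto()     # purchase one or more items
    DOWNLOAD_CONTENT = auto()  # download music, video, or images
    ADD_EVENT = auto()         # add an event to a calendar
    ADD_TO_ALBUM = auto()      # add content, e.g., images, to an album
```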

The patent uses data in images, as opposed to machine-readable codes such as bar codes. It may recognize entities found in images using an entity recognition program. An entity can be a thing, not just a person or a place, such as a hamburger or food in general. Actions that may be associated with specific entities may be defined by annotations associated with those entities. The patent tells us:

In some examples, one or more annotations are associated with each entity of the one or more entities. In some examples, the set of entities can be provided to an annotation engine, which processes the set of entities to provide a set of annotations. In some examples, the annotation engine is provided as one or more computer-executable programs that can be executed by one or more computing devices, e.g., the device and/or the server system. In some implementations, the entity recognition engine and the annotation engine are combined, e.g., are the same engine.

One type of annotation that might be associated with an entity that might be recognized in an image would be a search-related action:

For example, an entity can include the text “Best Band Ever,” which is depicted in the image data, and which is the name of a band of musicians. In some examples, the text “Best Band Ever” can be provided as a search query to the search engine, and search results can be provided, which are associated with the particular band. Example search results can include tour dates, albums, and/or merchandise associated with the band, which search results can be provided as annotations.
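The example above describes using the recognized entity text as a search query, with the results becoming annotations. A small sketch of that idea, where the search engine is a hard-coded stand-in and all names are hypothetical:

```python
# A stand-in search engine; a real implementation would query a search backend.
def fake_search(query: str) -> list[str]:
    index = {"Best Band Ever": ["tour dates", "albums", "merchandise"]}
    return index.get(query, [])

def annotate_with_search(entity: str, search=fake_search) -> list[str]:
    """Provide the entity text as a search query; each result becomes an annotation."""
    return search(entity)

annotations = annotate_with_search("Best Band Ever")
# annotations -> ["tour dates", "albums", "merchandise"]
```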

Structured Data Actions and Images

The patent points out the possibility that an annotation “can be provided based on cross-referencing entities with a structured data graph, e.g., a knowledge graph.”

The patent provides three examples of how actions might be provided when they are mapped to entities or annotations:

(1) For example, an entity and/or annotation that is associated with an event, e.g., a concert, can be mapped to an add event action that can be executed to add an event to a calendar, and/or to a purchase action, e.g., to purchase tickets to the event, purchase albums. Consequently, the add event action and/or the purchase action can be included in the one or more actions.

(2) As another example, an entity and/or annotation can correspond to an image album of the user, e.g., a Food album, and can be mapped to an add to album action that can be executed to add content, e.g., image, to an album. Accordingly, the action is provided based on user-specific information, e.g., the knowledge that the user has an image album relevant to the entity depicted in the image.

(3) As another example, an entity and/or annotation can correspond to one or more contacts of the user, e.g., within a social networking service, and can be mapped to a share image action that can be executed to share the image with the contacts. Accordingly, the action is provided based on user-specific information, e.g., the knowledge that the user typically shares content depicted in the image with the particular contacts.
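The three examples above map entities and annotations to actions, with the second and third drawing on user-specific information (the user's albums and sharing habits). A rough sketch of that mapping logic; the data structures and action names are my own, not the patent's:

```python
def actions_for(annotation: str, user: dict) -> list[str]:
    """Map an entity/annotation to actions, mirroring the patent's three
    examples. All structures here are hypothetical illustrations."""
    actions = []
    if annotation == "concert":                    # example 1: an event
        actions += ["add_event", "purchase_tickets"]
    if annotation in user.get("albums", []):       # example 2: user has a matching album
        actions.append("add_to_album")
    if annotation in user.get("shared_with", {}):  # example 3: user shares such content
        actions.append("share_image")
    return actions

# A user who keeps a "Food" photo album and typically shares food photos.
user = {"albums": ["Food"], "shared_with": {"Food": ["alice", "bob"]}}
actions_for("Food", user)  # -> ["add_to_album", "share_image"]
```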

The patent also points out some other examples.

A book captured in image data may return structured data information associated with that book, such as an image of the cover, title, summary, author, publication date, and genre.

A band represented in image data may return structured data information associated with the band, such as a picture of the band, a list of band members, and a list of albums.
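Those two examples amount to a lookup from a recognized entity type to a set of structured-data fields, presumably sourced from a knowledge graph. A minimal sketch, with field names taken from the patent's examples and everything else hypothetical:

```python
# Hypothetical knowledge-graph field lists for the patent's two examples.
STRUCTURED_DATA = {
    "book": ["cover_image", "title", "summary", "author",
             "publication_date", "genre"],
    "band": ["band_picture", "members", "albums"],
}

def structured_result(entity_type: str) -> list[str]:
    """Return the structured-data fields associated with a recognized entity type."""
    return STRUCTURED_DATA.get(entity_type, [])
```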

Takeaways

The Google I/O presentation showed information about a business being returned in response to a photo taken of that business. That sounds like an annotation drawn from knowledge graph information about the business. Google doesn’t promise that verifying your business in Google My Business will automatically earn it a knowledge panel in search results, but verification does seem to help in many instances. I imagine that Google will likely publish more about how to set up structured data so that specific annotations are associated with different entities. It does appear to be a sign that Google will be finding ways to use structured data from web pages that people may not have anticipated, such as the similar-items image search results introduced within the past couple of months.
