Good Bye Knowledge Graph, Hello Google Knowledge Vault?

Posted Aug 25, 2014


Last week I retweeted an article I saw in New Scientist, a magazine that doesn’t usually offer insights into things happening at Google. The title of the article caught my eye: “Google’s fact-checking bots build vast knowledge bank.” I think I noticed it in part because I was spending the week in San Jose, co-presenting a tutorial on the Semantic Web at the Semantic Technology and Business Conference (#SemTechBiz2014). I thought that was pretty timely.

The KDD 2014 Conference Home Page

This morning, at the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Google is expected to announce that they are trying a new approach with a knowledge base, applying a mix of technologies to automate a lot of the collection of information that helps in:

  • Fueling efforts at Google to present more and better knowledge panel results
  • Helping Google recognize entities (specific people, places, and things) within queries
  • Making predictive algorithms for personal assistants such as Google Now, Siri, and Cortana smarter and more helpful
  • Bringing us new applications and uses that haven’t been discovered yet

The announcement carries with it a lot of new information about approaches Google follows to deliver more information to people faster, in a more complete form. Before you delve too quickly into this topic, reading the following paper (PDF) from Google is highly recommended:

Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion.

At the Semantic Web Conference, a common theme in presentations from search engineers such as Google Fellow Ramanathan Guha was the importance of getting knowledge base information to scale better and to support applications ranging from rich snippets and knowledge panels to mobile apps and others. These many different kinds of applications need to be able to use such information, and the more information that is available, the more effective they will be.

For example, there have been medical studies conducted by organizations for the US Government where the conclusions have been published online, but the data from the studies hasn’t been shared. It’s a shame that this knowledge hasn’t been shared in a way that could potentially help others.

Oddly, the new Knowledge Vault doesn’t have as much information as Google’s Freebase knowledge base. Freebase has 637M (non-redundant) facts, and the Knowledge Vault has only 302M “confident” facts. Of those, only 223M are in Freebase. Many Freebase entities are also missing values for basic facts: 71% of people in Freebase have no known place of birth, and 75% have no known nationality. But the increase in the “confidence” of facts included in the Knowledge Vault is significant.
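To make the idea of a “confident” fact a little more concrete, here is a minimal sketch in Python. It assumes that each fact is stored as a subject, a predicate, and an object, along with a score estimating how likely the fact is to be true, and that a fact counts as “confident” when that score clears a cutoff. The `Fact` type, the `confident_facts` function, the 0.9 cutoff, and the example facts are all made up for illustration; they are not taken from Google’s system.

```python
from typing import List, NamedTuple


class Fact(NamedTuple):
    """A hypothetical fact record: (subject, predicate, object) plus a score."""
    subject: str
    predicate: str
    obj: str
    confidence: float  # estimated probability that the fact is true


def confident_facts(facts: List[Fact], threshold: float = 0.9) -> List[Fact]:
    """Keep only the facts whose score meets the cutoff.

    The 0.9 default is an illustrative assumption, not a documented value.
    """
    return [f for f in facts if f.confidence >= threshold]


# Made-up example facts, for illustration only.
facts = [
    Fact("Person A", "place_of_birth", "City X", 0.95),
    Fact("Person A", "nationality", "Country Y", 0.40),
    Fact("Person B", "place_of_birth", "City Z", 0.91),
]

kept = confident_facts(facts)
for fact in kept:
    print(fact.subject, fact.predicate, fact.obj, fact.confidence)
```

In this toy example the low-scoring nationality fact is dropped, which is roughly why a store of “confident” facts can be smaller than a larger collection that makes no such cut.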

Some subjects are simply hard to capture in a Freebase entry.

A Freebase Entry for a Biblical Figure using Bible-based facts.

Google has been working on this automated approach to building the Knowledge Vault, with greater confidence in the facts it includes.

A detailed presentation for KDD titled Constructing and Mining Web-scale Knowledge Graphs is worth downloading and studying if you want more details.

With this faster, automated approach by Google to building and representing knowledge in the Knowledge Vault, we’re pretty excited by how much additional information it can provide to searchers, and how many additional opportunities it can bring to site owners.
