Go Fish Digital

Good Bye Knowledge Graph, Hello Google Knowledge Vault?

By Bill Slawski on Aug 25, 2014

Last week I retweeted an article I saw in New Scientist, a magazine that doesn’t usually offer insights into things happening at Google. The title of the article, “Google’s fact-checking bots build vast knowledge bank,” caught my eye. I think I noticed it in part because I was spending the week in San Jose, co-presenting a tutorial on the Semantic Web at the Semantic Technology and Business Conference (#SemTechBiz2014), so the timing felt right.

The KDD 2014 Conference Home Page

This morning, at the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Google is expected to announce that it is trying a new approach to building a knowledge base, applying a mix of technologies to automate much of the collection of information that helps in:

  • Fueling efforts at Google to present more and better knowledge panel results
  • Helping Google recognize entities (specific people, places, and things) within queries
  • Making predictive algorithms for personal assistants such as Google Now, Siri, and Cortana smarter and more assistive
  • Bringing us new applications and uses that haven’t been discovered yet

The announcement carries with it a lot of new information about the approaches Google follows to deliver more information to people faster and in a more complete form. Before you delve too deeply into this topic, I highly recommend reading the following paper (PDF) from Google:

Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion.

At the Semantic Web Conference, a common theme in presentations from search engineers such as Google Fellow Ramanathan Guha was the importance of getting knowledge base information to scale better, so that it can support applications ranging from rich snippets to knowledge panels to mobile apps. All of these applications need to be able to use such information, and the more of it that is available, the more effective they will be.

For example, there have been medical studies conducted by organizations for the US Government where the conclusions have been published online, but the data from the studies hasn’t been shared. It’s a shame that this knowledge hasn’t been shared in a way that could potentially help others.

Oddly, the new Knowledge Vault doesn’t have as much information as Google’s Freebase knowledge base. Freebase has 637M (non-redundant) facts, and the Knowledge Vault has only 302M “confident” facts. Of those only 223M are in Freebase. There are lots of missing values for facts in Freebase entities, with 71% of people in Freebase having no known place of birth, and 75% having no known nationality. But the increase in “confidence” of facts included in the Knowledge Vault is significant.
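A quick back-of-the-envelope check helps put those three totals in perspective. The short Python sketch below only does arithmetic on the figures quoted above. It assumes (my assumption, not something the numbers themselves state) that facts are counted the same way in both sources, so the 223M overlap can be subtracted from each total. The function and key names are made up for illustration.

```python
# Figures quoted in the post, in millions of facts.
FREEBASE_FACTS_M = 637            # non-redundant facts in Freebase
KV_CONFIDENT_FACTS_M = 302        # "confident" facts in the Knowledge Vault
KV_CONFIDENT_IN_FREEBASE_M = 223  # confident facts that are also in Freebase


def overlap_summary(freebase, kv_confident, kv_in_freebase):
    """Derive counts and shares from the three quoted totals.

    Assumption: the overlap is a subset of both totals and facts are
    counted the same way in each source, so simple subtraction is valid.
    """
    return {
        "kv_facts_not_in_freebase_m": kv_confident - kv_in_freebase,
        "freebase_facts_not_in_kv_m": freebase - kv_in_freebase,
        "share_of_kv_already_in_freebase": kv_in_freebase / kv_confident,
        "share_of_freebase_in_kv": kv_in_freebase / freebase,
    }


summary = overlap_summary(
    FREEBASE_FACTS_M, KV_CONFIDENT_FACTS_M, KV_CONFIDENT_IN_FREEBASE_M
)

for name, value in summary.items():
    # Shares are fractions below 1; counts are whole millions.
    if name.startswith("share"):
        print(f"{name}: {value:.1%}")
    else:
        print(f"{name}: {value}M")
```

Under that assumption, roughly 79M of the Knowledge Vault’s confident facts would not be in Freebase, and about 414M Freebase facts would not be among the Knowledge Vault’s confident facts.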

Some problems can make creating a Freebase entry difficult.

A Freebase Entry for a Biblical Figure using Bible-based facts.

Google’s been working on this automated approach to building the Knowledge Vault with greater confidence in the facts it includes, using methods such as:

A detailed presentation for KDD titled Constructing and Mining Web-scale Knowledge Graphs is worth downloading and studying if you want more details.

With this faster, automated approach to building and representing knowledge in the Knowledge Vault, we’re pretty excited by how much additional information Google can provide to searchers, and how many additional opportunities it can bring to site owners.

10 Comments

  1. Jan-Willem Bobbink

    August 25th, 2014 at 3:38 pm

    You made a typo: “Of those only 302M are in Freebase.” should be “Of those, 223M are in Freebase” :)

  2. Bill Slawski

    August 25th, 2014 at 4:14 pm

    Hi Jan-Willem.

    Thanks for pointing that out.

    I wanted to really stress that a lot of the facts in Freebase didn’t make it into the Knowledge Vault, so it’s not as if Google just took Freebase and other knowledge base information from elsewhere and just kept it and stitched it together somehow, as I saw someone else report about the Knowledge Vault. :(

  3. Amit Roy

    August 26th, 2014 at 2:18 am

    That info was out of this world, Bill. Really commendable. I have downloaded the paper and also gone through the New Scientist article. Really feeling excited about more such input coming in the near future.

    • Bill Slawski

      August 26th, 2014 at 2:43 am

      Thanks, Amit,

      I was really excited to see the many things Google was trying involving the Knowledge Vault, too. The near future looks interesting. :)

  4. Oscar

    August 26th, 2014 at 12:23 pm

    I see everything on your site except the text of this blogpost. I only see the image.

    After I refresh a couple of times I see the text, when I refresh again a couple of times it goes away.

    You should look at it.

    • Bill Slawski

      August 27th, 2014 at 7:14 am

      Hi Oscar

      Thanks for the heads up. I will have someone check into it.
