Using Semantic Frames to Add Context to Word Embeddings
Word embeddings give Google a way to look at text, whether a short tweet, a query, a page, or a whole site, and better understand the words in it. They can suggest when a word or a sentence could be substituted or added, which is how query rewriting under something like RankBrain takes place. But the word embedding approach on its own doesn't capture the context of words, like the difference between a river bank and withdrawing money from a bank account. So Google has been exploring ways to pre-train on text, so that this natural language processing approach can not only understand what might be missing, but also better understand the contexts and meanings of words.
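To see why context matters, here is a toy sketch of the limitation. A static word-embedding lookup assigns exactly one vector per word, so "bank" gets the same representation whether the sentence is about a river or about money. The vectors below are made up for illustration, not real trained embeddings:

```python
# Toy static embedding table: one fixed vector per word, regardless of
# the sentence the word appears in. Values are invented for this example.
embeddings = {
    "bank":  [0.21, -0.43, 0.77],
    "river": [0.05, 0.60, -0.12],
    "money": [0.33, -0.51, 0.80],
}

def embed(sentence):
    """Look up a static vector for each known word in a sentence."""
    return [embeddings[w] for w in sentence.split() if w in embeddings]

riverside = embed("the river bank was muddy")
financial = embed("withdraw money from the bank")

# The same "bank" vector appears in both sentences, with no sense
# distinction between the riverside and the financial meaning.
print(riverside[-1] == financial[-1])  # True
```

The pre-training approaches discussed below aim to get past exactly this: giving "bank" a representation that reflects the frame it appears in.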
The words we speak can be described as fitting into semantic frames that give them meaning. The same is true of the words we use on a website. If a pre-training approach can associate words with the semantic frames they might fit into, that may make this kind of natural language processing more effective.
More About Semantic Frames
When I worked for the largest trial court in the State of Delaware, we used terms that everyone working in the Court knew, but that most people wouldn't see in normal conversation, such as capias (a bench warrant issued by a judge) or nolle pros'd (a notice of nolle prosequi filed by a deputy attorney general stating that they decided not to prosecute a charge that had been indicted or brought on a warrant by a police officer). These words can mean that someone ends up being locked up, or released from jail or prison, and they are part of the everyday framework of language for people who work in a court system. When those words are explained in the context of a frame, such as the criminal justice system, they gain a lot of meaning.
Frame semantics has been part of computational linguistics for over 20 years. It appears to be something Google will be working into some of its more recent technology, such as the word vectors, or word embeddings, behind its RankBrain update. Before I talk about a Google patent that introduces that, I think it's important to look more closely at what semantic frames are and how they work.
Here's a definition of semantic frames to give this post more precision:
Thus, conceptual structures (called semantic frames) provide a background of beliefs and experiences that are necessary to interpret the lexical meaning of the word in question.
The criminal justice system I started this post with is a conceptual frame that gives words such as capias and nolle pros'd meaning. Without having been in that world, I wouldn't understand them. I also wouldn't know the difference between jail and prison: a jail is a place where someone is held before they have been tried and convicted of a criminal offense, while a prison is where they are sent after trial and sentencing. When someone uses either word and means the other, I know that they haven't worked in the criminal justice system; that frame is outside of their experience.
There is a project at the University of California, Berkeley, where people with experience in semantic frames are building knowledge about those frames in something called the FrameNet project. On the About FrameNet pages, we can find out more about how frame semantics works.
Learning about the FrameNet Project at Berkeley is a good starting point for this post. This video provides a good sense of the work behind the project.
The FrameNet Database
I don't usually start a blog post with a video, or even two of them, but I also watched and enjoyed this video explaining what frame semantics is, and thought that sharing it too would make the process I am describing in this post make a lot more sense:
A course in Cognitive Linguistics: Frame Semantics
The video presents a number of semantic frames to help explain how a word derives meaning from the frame it appears within.
To provide more technical background, the founder of the FrameNet project at Berkeley wrote the following paper: Frame Semantics for Text Understanding. Spending time at the website of the Berkeley FrameNet Project can also be helpful.
Beyond the FrameNet project pages, it is rare to see a Google patent whose inventors have written a whitepaper on the same topic. I've seen this done with many patents from Microsoft, but only a handful from Google. In this case, there is one that is worth spending some time with: Semantic Frame Identification with Distributed Word Representations.
Websites that focus on a specific industry or affiliation, such as construction, cooking, or music, may use words that are very relevant to the dialogue common among people engaged in those fields.
I explained how working at the Delaware courts gave me an understanding of words commonly used at a courthouse that often meant the difference between people being locked up or released from prison, but which most people wouldn't understand. The video on cognitive linguistics describes the frame of someone waiting on customers in a restaurant, and how words come from that frame. The frame of commercial buying was also mentioned, and is illustrated in a screenshot from the Google patent, which shows how such language might be annotated under that frame:
While this newly granted patent from Google tells us about how semantic frames may be used with word embeddings, Google also has a patent about Word Embeddings, invented by members of the Google Brain Team, which I wrote about in the post, Citations behind the Google Brain Word Vector Approach.
Google’s Definition of Semantic Frames
Many patents are filled with definitions, and this new one from Google is no different. While I have provided some examples and a definition of semantic frames, and a couple of videos about them, Google's definition from the patent is worth looking at because it provides context for how semantic frames may be used in the process the patent describes. Here is how they define semantic frames:
Linguistic semantics focuses on the history of how words have been used in the past. Frame semantics is a theory of language meaning that relates linguistic utterances to word knowledge, such as event types and their participants. A semantic frame refers to a collection of facts or a coherent structure of related concepts that specify features (attributes, functions, interactions, etc.) that are typically associated with the specific word. One example semantic frame is the situation of a commercial transfer or transaction, which can involve a seller, a buyer, goods, and other related things.
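The commercial transaction example from that definition can be sketched as a simple data structure. The frame and role names below loosely follow FrameNet's commerce frames, and the annotated sentence is a made-up example:

```python
# A minimal sketch of a semantic frame as a data structure: a named
# frame plus the roles (frame elements) it makes available. Role names
# loosely follow FrameNet's commercial-transaction frames.
commercial_transaction = {
    "frame": "Commercial_transaction",
    "roles": ["Buyer", "Seller", "Goods", "Money"],
}

# Annotating the invented sentence
# "Mary bought a bicycle from John for $200" against that frame:
annotation = {
    "predicate": "bought",
    "Buyer": "Mary",
    "Goods": "a bicycle",
    "Seller": "John",
    "Money": "$200",
}

# Every filled role should be one the frame actually defines.
assert all(role in commercial_transaction["roles"]
           for role in annotation if role != "predicate")
```

The point of the structure is that once "bought" evokes this frame, the other phrases in the sentence gain meaning by filling its roles, just as capias gains meaning inside the criminal justice frame.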
So, this use of semantic frames with word embeddings is interesting, and the look at semantic frames was one worth taking. The summary section of the patent's description shows us how frames might be useful:
A computer-implemented technique is presented. The technique can include receiving, at a server having one or more processors, labeled training data including a plurality of groups of words, each group of words having a predicate word, each word having generic word embeddings. The technique can include extracting, at the server, the plurality of groups of words in a syntactic context of their predicate words. The technique can include concatenating, at the server, the generic word embeddings to create a high dimensional vector space representing features for each word. The technique can include obtaining, at the server, a model having a learned mapping from the high dimensional vector space to a low dimensional vector space and learned embeddings for each possible semantic frame in the low dimensional vector space. The technique can also include outputting, by the server, the model for storage, the model is configured to identify a specific semantic frame for an input.
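The steps in that quoted claim can be sketched in code. This is only an illustration of the claimed pipeline, with tiny made-up dimensions and random values standing in for the trained parameters, not Google's actual implementation:

```python
import numpy as np

# Illustrative dimensions; the real system would use much larger ones.
rng = np.random.default_rng(0)
emb_dim, context_size, low_dim = 4, 3, 2

# 1. Generic embeddings for the words in the predicate's syntactic context.
context_embeddings = [rng.standard_normal(emb_dim) for _ in range(context_size)]

# 2. Concatenate them into one high-dimensional feature vector (12 dims here).
high_dim = np.concatenate(context_embeddings)

# 3. A learned linear mapping from the high- to the low-dimensional space
#    (random here, standing in for the trained model).
W = rng.standard_normal((low_dim, emb_dim * context_size))
projected = W @ high_dim

# 4. Learned embeddings for each candidate frame live in the same
#    low-dimensional space (again random stand-ins, with invented names).
frames = {
    "Commercial_transaction": rng.standard_normal(low_dim),
    "Motion": rng.standard_normal(low_dim),
}

# 5. Identify the frame whose embedding best matches the projected input.
best = max(frames, key=lambda name: frames[name] @ projected)
print(best)
```

The key design idea the claim describes is that frames and projected word contexts share one low-dimensional space, so frame identification reduces to finding the closest frame embedding.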
This differs from the word embeddings approach that Google previously patented because it adds context to word embeddings, as learned from semantic frames.
That is brought out more clearly in an alternative embodiment of the patent, also described in the summary section of the patent:
In other embodiments, the labeled training data includes (i) frames for verbs and (ii) possible semantic roles for each frame, and modifier roles in the labeled training data are shared across different frames.
The summary also points to the possibility of this approach being useful for answering questions posed in spoken queries:
In other embodiments, the technique further includes: receiving, at the server, speech input representing a question, converting, at the server, the speech input to a text, analyzing, at the server, the text using the model, and generating and outputting, by the server, an answer to the question based on the analyzing of the text using the model.
Would this approach be useful in translation from one language to another? The patent summary tells us a variation of the process described within this patent could be used in that manner:
In some embodiments, the technique further includes: receiving, at the server, a text to be translated from a source language to a target language, the source language being a same language as a language associated with the model, analyzing, at the server, the text using the model, and generating and outputting, by the server, a translation of the text from the source language to the target language based on the analyzing of the text using the model.
Unsurprisingly, the summary of the patent also points to the possibility that the process behind the patent could be used in a way that could lead to returning search results:
In some embodiments, the operations further include: indexing a plurality of web pages using the model to obtain an indexed plurality of web pages and utilizing the indexed plurality of web pages to provide search results in response to a search query.
The Semantic Frames patent can be found at:
Semantic frame identification with distributed word representations
Inventors: Dipanjan Das, Kuzman Ganchev, Jason Weston, and Karl Moritz Hermann
Assignee: Google LLC
US Patent: 10,289,952
Granted: May 14, 2019
Filed: January 28, 2016
A computer-implemented technique can include receiving, at a server, labeled training data including a plurality of groups of words, each group of words having a predicate word, each word having generic word embeddings. The technique can include extracting, at the server, the plurality of groups of words in a syntactic context of their predicate words. The technique can include concatenating, at the server, the generic word embeddings to create a high dimensional vector space representing features for each word. The technique can include obtaining, at the server, a model having a learned mapping from the high dimensional vector space to a low dimensional vector space and learned embeddings for each possible semantic frame in the low dimensional vector space. The technique can also include outputting, by the server, the model for storage, the model is configured to identify a specific semantic frame for an input.
Is Google in a Post-Semantic Frames Time?
Google is continuing research on semantic frames, as seen in this recent paper from April 12, 2019:
The abstract for this paper explains why the research behind it was interesting:
We present a resource for the task of FrameNet semantic frame disambiguation of over 5,000 word-sentence pairs from the Wikipedia corpus. The annotations were collected using a novel crowdsourcing approach with multiple workers per sentence to capture interannotator disagreement. In contrast to the typical approach of attributing the best single frame to each word, we provide a list of frames with disagreement-based scores that express the confidence with which each frame applies to the word. This is based on the idea that inter-annotator disagreement is at least partly caused by the ambiguity that is inherent to the text and frames. We have found many examples where the semantics of individual frames overlap sufficiently to make them acceptable alternatives for interpreting a sentence. We have argued that ignoring this ambiguity creates an overly arbitrary target for training and evaluating natural language processing systems – if humans cannot agree, why would we expect the correct answer from a machine to be any different? To process this data we also utilized an expanded lemma-set provided by the Framester system, which merges FN with WordNet to enhance coverage. Our dataset includes annotations of 1,000 sentence-word pairs whose lemmas are not part of FN. Finally, we present metrics for evaluating frame disambiguation systems that account for ambiguity.
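The disagreement-based scoring that abstract describes can be sketched quickly. The annotator votes and frame names below are hypothetical, invented for the example:

```python
from collections import Counter

# Hypothetical crowd annotations: five workers each pick a frame for
# one word-sentence pair. Frame names here are illustrative only.
votes = ["Getting", "Getting", "Arriving", "Getting", "Arriving"]

# Instead of keeping only the single most-voted frame, assign each
# frame a confidence score equal to its share of the annotator votes,
# preserving the disagreement as a signal of inherent ambiguity.
counts = Counter(votes)
scores = {frame: n / len(votes) for frame, n in counts.items()}
print(scores)  # {'Getting': 0.6, 'Arriving': 0.4}
```

A single-best-frame dataset would throw away the 0.4 here; the paper's argument is that this residual score reflects real ambiguity rather than annotator error.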
But we can't be sure how much effort Google will put into this semantic frames approach, because other approaches worth investigating have also shown up.
An even newer paper than the Semantic Frames one I just linked to was published on May 15, 2019.
The abstract from the paper:
Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks. We focus on one such model, BERT, and aim to quantify where linguistic information is captured within the network. We find that the model represents the steps of the traditional NLP pipeline in an interpretable and localizable way and that the regions responsible for each step appear in the expected sequence: POS tagging, parsing, NER, semantic roles, then coreference. Qualitative analysis reveals that the model can and often does adjust this pipeline dynamically, revising lower-level decisions on the basis of disambiguating information from higher-level representations.
I've summarized the summary of this patent, but looking at it, and at what has come after it, it might be worth skipping ahead in time to see some of the other things Google is working on. The detailed description of the patent provides more detail about how it works. However, one of the inventors of this semantic frames patent, and an author of the related whitepaper (Dipanjan Das), is also an author of a more recent Google paper on BERT, which has been creating a buzz around the search industry (the classical NLP pipeline paper I linked to above). The semantic frames patent is an updated continuation of a patent originally filed on May 7, 2014. Knowing about semantic frames and how they could potentially be used is helpful, especially for understanding how this approach aims to give context to the words being processed.
But there is also BERT, which is worth looking at for how it pre-trains text, too:
BERT can be found on Github: https://github.com/google-research/bert
So, understanding how Google does natural language processing can be helpful, as can understanding the role semantic frames could play in it; but adding context to words appears to be something that BERT can do, too.