Dissecting The Analects: an NLP-based exploration of semantic similarities and differences across English translations | Humanities and Social Sciences Communications

A Survey of Semantic Analysis Approaches SpringerLink

Watson’s translation also records a substantially higher percentage (34%) within the 95–100% range than the other translators do.

We have organized the predicate inventory into a series of taxonomies and clusters according to shared aspectual behavior and semantics. These structures let us show external relationships between predicates, such as granularity and valency differences, and in turn make explicit the inter-class relationships that were previously only implicit. As in the classic VerbNet representations, we use E to indicate a state that holds throughout an event.

This type of structure made it impossible to be explicit about the opposition between an entity’s initial state and its final state. It also made the job of tracking participants across subevents much more difficult for NLP applications. Understanding that the statement ‘John dried the clothes’ entailed that the clothes began in a wet state would require that systems infer the initial state of the clothes from our representation. By including that initial state in the representation explicitly, we eliminate the need for real-world knowledge or inference, an NLU task that is notoriously difficult. In order to accommodate such inferences, the event itself needs to have substructure, a topic we now turn to in the next section. In the rest of this article, we review the relevant background on Generative Lexicon (GL) and VerbNet, and explain our method for using GL’s theory of subevent structure to improve VerbNet’s semantic representations.
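To make the “John dried the clothes” point concrete, here is a minimal sketch of an event with explicit substructure. It is not VerbNet’s actual notation: the `Subevent` class, the `dry_event` helper, and the predicate name `dry` are all hypothetical, chosen only to show how storing the initial state removes the need to infer it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subevent:
    """One step of an event: a predicate that holds (or not) of a participant."""
    label: str          # e.g. "e1", "e2" (hypothetical labels)
    predicate: str      # e.g. "dry" (hypothetical predicate name)
    participant: str
    negated: bool = False


def dry_event(patient: str) -> list[Subevent]:
    """Hypothetical representation of 'X dried <patient>'.

    The initial state (not dry) is stored explicitly, so a consumer
    can read it off instead of inferring it from world knowledge.
    """
    return [
        Subevent("e1", "dry", patient, negated=True),   # initial state: not dry
        Subevent("e2", "dry", patient, negated=False),  # final state: dry
    ]


def describe(sub: Subevent) -> str:
    """Render a subevent as a readable string."""
    prefix = "not " if sub.negated else ""
    return f"{sub.label}: {prefix}{sub.predicate}({sub.participant})"


events = dry_event("clothes")
for sub in events:
    print(describe(sub))
```

Because both states are listed, the opposition between them is a simple comparison of the first and last subevents rather than an inference step.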

Machine Translation and Attention

Stemming breaks a word down to its “stem,” the base form that its variants are built on. A stemmer compares related words and reduces them to their smallest shared part, even if that part is not a word itself. There are multiple stemming algorithms, and the most popular is the Porter Stemming Algorithm, which has been around since the 1980s. Lemmatizers, by contrast, are generally corpus-based and return dictionary forms. The stems for “say,” “says,” and “saying” are all “say,” while the lemmas from WordNet are “say,” “say,” and “saying.”
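The idea of suffix stripping can be illustrated with a toy stemmer. This is a deliberately simplified sketch, not the Porter algorithm; the function name `toy_stem`, the suffix list, and the minimum stem length are assumptions made for the example.

```python
def toy_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip one common English suffix, keeping at least min_stem_len characters.

    A toy illustration of suffix stripping only; real stemmers such as
    Porter's apply many ordered rules with extra conditions.
    """
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


for w in ("say", "says", "saying", "running"):
    print(w, "->", toy_stem(w))
```

All three forms of “say” collapse to the same stem, while “running” becomes “runn,” which shows that a stem does not have to be a real word.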

However, we did find commonalities in smaller groups of these classes and could develop representations consistent with the structure we had established. Many of these classes had used unique predicates that applied to only one class. We attempted to replace these with combinations of predicates we had developed for other classes, or to reuse them in related classes we found.

This degree of language understanding can help companies automate even the most complex language-intensive processes and, in doing so, transform the way they do business. So the question is, why settle for an educated guess when you can rely on actual knowledge?

How NLP & NLU Work For Semantic Search

German speakers, for example, can merge words (more accurately “morphemes,” but close enough) together to form a larger word. The German word for “dog house” is “Hundehütte,” which contains the words for both “dog” (“Hund”) and “house” (“Hütte”). Splitting text into individual tokens is necessary because word order does not need to be exactly the same between the query and the document text, except when a searcher wraps the query in quotes.
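A compound such as “Hundehütte” can be split with a simple dictionary lookup. The sketch below is an illustration under stated assumptions: the function `split_compound`, the tiny vocabulary, and the list of linking elements are hypothetical, and production decompounders are considerably more careful.

```python
def split_compound(word, vocab, linkers=("", "e", "s", "es", "n", "en")):
    """Split a two-part compound into known words, allowing a linking element.

    Returns [head, tail] when both parts are in vocab, else [word] unchanged
    (lowercased). Only handles two-part compounds; a toy sketch.
    """
    lower = word.lower()
    # Try every split point that leaves at least three characters on each side.
    for i in range(3, len(lower) - 2):
        head, rest = lower[:i], lower[i:]
        if head not in vocab:
            continue
        for link in linkers:
            if rest.startswith(link) and rest[len(link):] in vocab:
                return [head, rest[len(link):]]
    return [lower]


# Hypothetical mini-vocabulary for the example.
vocab = {"hund", "hütte", "haus"}

print(split_compound("Hundehütte", vocab))
print(split_compound("Katze", vocab))
```

Indexing the parts alongside the full word lets a query for “Hütte” match a document that only contains “Hundehütte.”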

We propose to incorporate explicit lexical and concept-level semantics from knowledge bases to improve inference accuracy. We conduct an extensive evaluation of four models using different sentence encoders: continuous bag-of-words, a convolutional neural network, a recurrent neural network, and the transformer model. Experimental results demonstrate that semantics-aware neural models give better accuracy than those without semantic information. Averaged over the three strong models, our semantics-aware approach improves natural language inference in different languages.
