Automatic annotation for training and evaluation of semantic analysis engines
US9224103B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 13, 2013 |
| Grant date | Dec 29, 2015 |
| Priority date | — |
| Expiry date | Nov 30, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06N20/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Implementations include systems and methods that generate data for training or evaluating semantic analysis engines. For example, a method may include receiving documents from a corpus that includes an authoritative set of documents from an authoritative source. Each document in the authoritative set may be associated with an entity. A second set of documents from the corpus, which does not overlap with the authoritative set, may include at least one link to a document in the authoritative set, the at least one link being associated with anchor text. For each document in the second set, the method may include identifying entity mentions in the document based on the anchor text. The method may include associating each entity mention with the entity in a graph-structured knowledge base or associating entity types with the entity mention. The method may also include training a semantic analysis engine using the identified entity mentions and associations.
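The annotation step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. The data layout (a dict of documents, each with `text` and `links`), the `Mention` record, the function name `annotate_mentions`, and the document and entity identifiers are all hypothetical and chosen only for illustration. The sketch assumes that every occurrence of a link's anchor text in a non-authoritative document counts as a mention of the entity associated with the linked authoritative document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    doc_id: str      # document in which the mention was found
    start: int       # character offset where the mention begins
    end: int         # character offset just past the mention
    surface: str     # the anchor text that matched
    entity_id: str   # entity tied to the linked authoritative document


def annotate_mentions(documents, authoritative):
    """Label entity mentions using anchor text of links into the authoritative set.

    documents:     dict doc_id -> {"text": str, "links": [(anchor_text, target_doc_id), ...]}
    authoritative: dict doc_id -> entity_id (hypothetical knowledge-base identifiers)
    """
    mentions = []
    for doc_id, doc in documents.items():
        if doc_id in authoritative:
            continue  # annotate only the second, non-overlapping set
        # dict.fromkeys drops repeated (anchor, target) pairs while keeping order
        for anchor, target in dict.fromkeys(doc["links"]):
            entity_id = authoritative.get(target)
            if entity_id is None or not anchor:
                continue  # link does not point into the authoritative set
            text = doc["text"]
            start = text.find(anchor)
            while start != -1:
                end = start + len(anchor)
                mentions.append(Mention(doc_id, start, end, anchor, entity_id))
                start = text.find(anchor, end)
    return mentions


# Hypothetical toy corpus: one authoritative page and one page that links to it.
authoritative = {"auth/ada": "/entity/ada_lovelace"}
documents = {
    "auth/ada": {"text": "Ada Lovelace was a mathematician.", "links": []},
    "blog/1": {
        "text": "Ada Lovelace wrote notes. Later, Ada Lovelace was honored.",
        "links": [("Ada Lovelace", "auth/ada"), ("notes", "other/x")],
    },
}
mentions = annotate_mentions(documents, authoritative)
```

The resulting `Mention` records pair text spans with entity identifiers, which is the kind of labeled data the abstract describes using to train or evaluate a semantic analysis engine. Exact string matching is a simplification; a real system would likely need tokenization and disambiguation of anchor text that refers to more than one entity.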
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.