Generating training data for disambiguation
US9483462B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Aug 7, 2015 |
| Grant date | Nov 1, 2016 |
| Priority date | — |
| Expiry date | Aug 7, 2035 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L15/063
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method for generating training data for disambiguating an entity, where an entity is a word or word string related to a topic to be analyzed, includes: acquiring messages sent by users, each message including at least one entity in a set of entities; organizing the messages by user to obtain sets of messages, each set containing the messages sent by one user; identifying each set of messages in which the number of different entities is greater than or equal to a first threshold value, and identifying the user corresponding to each identified set as a hot user; receiving an instruction indicating an object entity to be disambiguated; determining a likelihood of co-occurrence of each keyword and the object entity in the sets of messages sent by hot users; and determining training data for the object entity on the basis of that likelihood of co-occurrence.
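
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how entities are matched, how the likelihood of co-occurrence is computed, or how training data is chosen from it, so the sketch assumes: case-insensitive substring matching; likelihood taken as the fraction of hot users' message sets containing a keyword that also contain the object entity; and training data taken as messages that mention the object entity together with a keyword scoring at least `min_score`. All function names, the `min_score` parameter, and the sample messages are hypothetical.

```python
from collections import defaultdict


def group_by_user(messages):
    """Organize (user, text) pairs into one message set per user."""
    by_user = defaultdict(list)
    for user, text in messages:
        by_user[user].append(text)
    return dict(by_user)


def entities_in(text, entities):
    """Assumption: an entity occurs if it is a case-insensitive substring."""
    lowered = text.lower()
    return {e for e in entities if e.lower() in lowered}


def find_hot_users(by_user, entities, first_threshold):
    """A user is 'hot' if their message set mentions at least
    first_threshold different entities."""
    hot = set()
    for user, texts in by_user.items():
        distinct = set()
        for text in texts:
            distinct |= entities_in(text, entities)
        if len(distinct) >= first_threshold:
            hot.add(user)
    return hot


def cooccurrence_likelihood(by_user, hot_users, keywords, object_entity):
    """Assumed measure: among hot users' message sets that contain the
    keyword, the fraction that also contain the object entity."""
    scores = {}
    for kw in keywords:
        with_kw = with_both = 0
        for user in hot_users:
            joined = " ".join(by_user[user]).lower()
            if kw.lower() in joined:
                with_kw += 1
                if object_entity.lower() in joined:
                    with_both += 1
        scores[kw] = with_both / with_kw if with_kw else 0.0
    return scores


def select_training_data(messages, scores, object_entity, min_score):
    """Assumed selection rule: keep messages that mention the object
    entity and at least one keyword whose score is >= min_score."""
    good = [kw.lower() for kw, s in scores.items() if s >= min_score]
    selected = []
    for _user, text in messages:
        lowered = text.lower()
        if object_entity.lower() in lowered and any(k in lowered for k in good):
            selected.append(text)
    return selected


# Hypothetical sample data: topic "zoo animals", ambiguous entity "jaguar".
messages = [
    ("ann", "saw a jaguar at the zoo"),
    ("ann", "the puma and the lion were asleep"),
    ("bob", "my jaguar needs a new engine"),
    ("cat", "lion cubs at the zoo today"),
    ("cat", "a jaguar was hiding in the trees"),
    ("cat", "the puma exhibit is closed"),
]
entities = {"jaguar", "puma", "lion"}

by_user = group_by_user(messages)
hot_users = find_hot_users(by_user, entities, first_threshold=2)
scores = cooccurrence_likelihood(by_user, hot_users, ["zoo", "engine"], "jaguar")
training = select_training_data(messages, scores, "jaguar", min_score=0.5)
```

In this toy data, users who mention several different entities of the topic serve as a proxy for users who are actually discussing the topic, so keywords that co-occur with the object entity in their messages are treated as evidence for the intended sense. The user who mentions only one entity (the car owner) does not influence the keyword scores.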
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.