Patent · US Active

Generating training data for disambiguation

US9483462B2 · kind B2 · utility

3Cited by
0References
17Claims
0Family size

Assignee

Inventors

Key dates

Filing dateAug 7, 2015
Grant dateNov 1, 2016
Priority date
Expiry dateAug 7, 2035

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG10L15/063
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A method for generating training data for disambiguation of an entity comprising a word or word string related to a topic to be analyzed includes acquiring sent messages by a user, each including at least one entity in a set of entities; organizing the messages and acquiring sets, each containing messages sent by each user; identifying a set of messages including different entities, greater than or equal to a first threshold value, and identifying a user corresponding to the identified set as a hot user; receiving an instruction indicating an object entity to be disambiguated; determining a likelihood of co-occurrence of each keyword and the object entity in sets of messages sent by hot users; and determining training data for the object entity on the basis of the likelihood of co-occurrence of each keyword and the object entity in the sets of messages sent by the hot users.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.