Patent · US · Active

Generation of text classifier training data

US10853580B1 · kind B1 · utility

Cited by: 7
References: 0
Claims: 26
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 30, 2019
Grant date: Dec 1, 2020
Priority date
Expiry date: Oct 30, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/045
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method includes receiving input designating a term of interest in a document of a document corpus and determining a target context embedding representing a target word group that includes the term of interest and context words located in the document proximate to the term of interest. The method also includes identifying, from among the document corpus, a first candidate word group that is semantically similar to the target word group and a second candidate word group that is semantically dissimilar to the target word group. The method further includes receiving user input identifying at least a portion of the first candidate word group as associated with a first label and identifying at least a portion of the second candidate word group as not associated with the first label. The method also includes generating labeled training data based on the user input to train a text classifier.
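
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a toy bag-of-words count vector in place of a real context embedding, cosine similarity as the similarity measure, and candidate word groups taken from other occurrences of the same term in the corpus. The names word_group, embed, cosine, pick_candidates, and build_training_data are hypothetical, and the user's labeling decision is simulated by a boolean flag.

```python
import math
from collections import Counter


def word_group(tokens, index, window=2):
    """Return the term at `index` plus up to `window` context words per side."""
    start = max(0, index - window)
    return tokens[start:index + window + 1]


def embed(group):
    """Toy context embedding (assumption): a bag-of-words count vector."""
    return Counter(group)


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_candidates(corpus, term, target_group):
    """Return (most similar, least similar) word groups that contain `term`."""
    target = embed(target_group)
    scored = []
    for doc in corpus:
        tokens = doc.lower().split()
        for i, tok in enumerate(tokens):
            if tok == term:
                group = word_group(tokens, i)
                if group != target_group:  # skip the target itself
                    scored.append((cosine(target, embed(group)), group))
    if len(scored) < 2:
        raise ValueError("need at least two candidate word groups")
    scored.sort(key=lambda pair: pair[0])
    return scored[-1][1], scored[0][1]


def build_training_data(judgments, label):
    """Turn (word group, is_associated) user judgments into labeled rows."""
    return [(" ".join(group), label, is_associated)
            for group, is_associated in judgments]


# Hypothetical three-document corpus; the user designates "bank" in the first.
corpus = [
    "the river bank flooded after heavy rain",
    "the river bank eroded after heavy storms",
    "she opened a bank account for savings",
]
tokens = corpus[0].split()
target_group = word_group(tokens, tokens.index("bank"))
similar, dissimilar = pick_candidates(corpus, "bank", target_group)

# Simulated user input: the similar group carries the label, the other does not.
training_data = build_training_data(
    [(similar, True), (dissimilar, False)], label="river_bank"
)
print(training_data)
```

Showing the user both the closest and the most distant candidate yields a positive and a negative example for the same label in a single pass, which is the pairing the abstract describes.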

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.