Patent · US Active

Bootstrapping text classifiers by language adaptation

US8521507B2 · kind B2 · utility

Cited by: 5
References: 7
Claims: 17
Family size: 0

Assignee

Inventors

Key dates

Filing date: Feb 22, 2010
Grant date: Aug 27, 2013
Priority date
Expiry date: Jul 9, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/35
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Training data in one language is leveraged to develop classifiers for multiple languages under circumstances where all of those classifiers will be performing the same kind of classification task, but relative to linguistically different sets of texts, thereby saving the cost of manually labeling a different set of training data for each language. Classification knowledge is learned for a source language in which training data are available. That knowledge is transferred to another target language's classifier through the integration of language transition knowledge. The transferred model is adjusted to better fit the target language. In one technique, leveraging one language's classification knowledge in order to generate a classifier for another language involves training a text classifier in a source language, transferring the learned classification knowledge from the source language to another target language using language translation techniques, and further tuning the transferred model to better fit the target language text.
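
The three steps named in the abstract (train in the source language, transfer through translation knowledge, tune on target-language text) can be illustrated with a minimal Python sketch. This is not the patented method, only a toy instance of the general idea under stated assumptions: the classifier is a bag-of-words log-odds model, the translation knowledge is a hypothetical hand-written table of p(target word | source word), and the tuning step is simple self-training on unlabeled target text, swapped in for whatever adjustment the patent claims. All function names, data, and parameters are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train_source(labeled_docs, smoothing=1.0):
    """Step 1: learn per-word log-odds weights from labeled source-language text."""
    pos, neg = Counter(), Counter()
    for text, label in labeled_docs:
        (pos if label == 1 else neg).update(tokenize(text))
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + smoothing * len(vocab)
    neg_total = sum(neg.values()) + smoothing * len(vocab)
    return {
        w: math.log((pos[w] + smoothing) / pos_total)
        - math.log((neg[w] + smoothing) / neg_total)
        for w in vocab
    }


def transfer(weights, translation_table):
    """Step 2: project source weights onto target words via p(target | source)."""
    target = defaultdict(float)
    for src, weight in weights.items():
        for tgt, prob in translation_table.get(src, {}).items():
            target[tgt] += prob * weight
    return dict(target)


def score(weights, text):
    return sum(weights.get(tok, 0.0) for tok in tokenize(text))


def classify(weights, text):
    return 1 if score(weights, text) > 0 else 0


def tune(weights, unlabeled_docs, rate=0.5, rounds=2):
    """Step 3 (assumed technique: self-training). Words with no transferred
    weight receive the average pseudo-label of the documents they occur in."""
    tuned = dict(weights)
    for _ in range(rounds):
        votes = defaultdict(list)
        for text in unlabeled_docs:
            s = score(tuned, text)
            if s == 0:
                continue  # the model has no opinion on this document yet
            for tok in set(tokenize(text)):
                if tok not in tuned:
                    votes[tok].append(1.0 if s > 0 else -1.0)
        for tok, v in votes.items():
            tuned[tok] = rate * sum(v) / len(v)
    return tuned


# Toy English training data (1 = positive, 0 = negative); purely illustrative.
source_docs = [
    ("great film loved it", 1),
    ("wonderful great acting", 1),
    ("terrible film hated it", 0),
    ("awful terrible plot", 0),
]
# Hypothetical English-to-Spanish translation probabilities.
translation_table = {
    "great": {"genial": 0.7, "estupenda": 0.3},
    "terrible": {"horrible": 1.0},
}
# Unlabeled target-language text used only for tuning.
target_docs = ["genial y divertida", "horrible y aburrida", "divertida"]

source_weights = train_source(source_docs)
transferred = transfer(source_weights, translation_table)
tuned = tune(transferred, target_docs)

print(classify(transferred, "divertida"), classify(tuned, "divertida"))
```

In this toy setup the transferred model knows nothing about "divertida" because no translation entry covers it. Tuning assigns it a positive weight because it co-occurs with "genial" in the unlabeled target text, while the function word "y", which appears in both a positive and a negative document, ends up with a weight of zero.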

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.