Patent · US Active

Bootstrapping text classifiers by language adaptation

US8521507B2 · kind B2 · utility

5 Cited by

7 References

17 Claims

0 Family size

Assignee

YAHOO HOLDINGS, INC. · US

Inventors

Lei Shi · Beijing, CN
Mingjun Tian · Beijing, CN

Key dates

Filing date	Feb 22, 2010
Grant date	Aug 27, 2013
Priority date	—
Expiry date	Jul 9, 2031

Classification

Technology area (CPC G)	Physics
CPC primary	G06F16/35
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Training data in one language is leveraged to develop classifiers for multiple languages under circumstances where all of those classifiers will be performing the same kind of classification task, but relative to linguistically different sets of texts, thereby saving the cost of manually labeling a different set of training data for each language. Classification knowledge is learned for a source language in which training data are available. That knowledge is transferred to another target language's classifier through the integration of language transition knowledge. The transferred model is adjusted to better fit the target language. In one technique, leveraging one language's classification knowledge in order to generate a classifier for another language involves training a text classifier in a source language, transferring the learned classification knowledge from the source language to another target language using language translation techniques, and further tuning the transferred model to better fit the target language text.
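The three steps named in the abstract (train in the source language, transfer through translation, tune on target-language text) can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not specify the model, the translation resource, or the tuning procedure. The sketch assumes a word-count model with add-one smoothing, a toy English-to-Spanish word dictionary, and one round of self-training on unlabeled target text. All function names and sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_source(labeled_docs):
    """Step 1: count words per class from labeled source-language documents."""
    model = defaultdict(Counter)
    for text, label in labeled_docs:
        model[label].update(text.lower().split())
    return dict(model)

def transfer(model, dictionary):
    """Step 2: move each word's count onto its target-language translations.
    Words with no dictionary entry are dropped (an assumption of this sketch)."""
    target = {}
    for label, counts in model.items():
        moved = Counter()
        for word, count in counts.items():
            translations = dictionary.get(word, [])
            for t in translations:
                moved[t] += count / len(translations)
        target[label] = moved
    return target

def classify(model, text):
    """Score each class with add-one-smoothed log-probabilities of known words."""
    vocab = set().union(*model.values())
    best_label, best_score = None, -math.inf
    for label, counts in model.items():
        total = sum(counts.values()) + len(vocab)
        score = sum(math.log((counts[w] + 1) / total)
                    for w in text.lower().split() if w in vocab)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def tune(model, unlabeled_target_docs):
    """Step 3: one round of self-training. Predict labels with the transferred
    model, then add each document's words to the counts of its predicted class."""
    predictions = [(doc, classify(model, doc)) for doc in unlabeled_target_docs]
    tuned = {label: Counter(counts) for label, counts in model.items()}
    for doc, label in predictions:
        tuned[label].update(doc.lower().split())
    return tuned

# Hypothetical toy data: English is the source language, Spanish the target.
source_docs = [
    ("great match and a late goal", "sports"),
    ("the team won the game", "sports"),
    ("stocks fell as the market closed", "finance"),
    ("the bank raised interest rates", "finance"),
]
en_to_es = {
    "goal": ["gol"], "team": ["equipo"], "game": ["partido"], "match": ["partido"],
    "market": ["mercado"], "bank": ["banco"], "stocks": ["acciones"], "rates": ["tasas"],
}
unlabeled_es = [
    "el equipo anota un gol en el partido",
    "el banco sube las tasas y el mercado cae",
]

source_model = train_source(source_docs)
transferred = transfer(source_model, en_to_es)
tuned = tune(transferred, unlabeled_es)
```

In this sketch the transferred model only knows words that appear in the dictionary. The tuning pass lets it pick up target-language words such as "anota" that co-occur with the translated ones, which is one simple reading of adjusting the transferred model to better fit the target language.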

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.