Patent · US Active

Discovery of parallel text portions in comparable collections of corpora and training using comparable texts

US8296127B2 · kind B2 · utility

Cited by: 60
References: 190
Claims: 29
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 22, 2005
Grant date: Oct 23, 2012
Priority date
Expiry date: Dec 16, 2029

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/42
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A translation training device extracts a set of parallel sentences from two nonparallel corpora. The system finds parameters between different sentences or phrases in order to identify parallel sentences. The parallel sentences are then used to train a data-driven machine translation system. The process can be applied repeatedly until sufficient data is collected or until the performance of the translation system stops improving.
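
The loop described in the abstract (extract candidate parallel sentences, retrain, repeat) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the word-overlap score stands in for whatever parameters the system actually computes, the toy `retrain` step stands in for training a real translation system, and "no new pairs found" is used as a proxy for "performance stops improving". The names `overlap_score`, `retrain`, `bootstrap`, and the sample data are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]
Lexicon = Dict[str, Set[str]]


def overlap_score(src: str, tgt: str, lexicon: Lexicon) -> float:
    """Fraction of source words that have a known translation in the target."""
    src_words = src.lower().split()
    tgt_words = set(tgt.lower().split())
    if not src_words:
        return 0.0
    hits = sum(1 for w in src_words if lexicon.get(w, set()) & tgt_words)
    return hits / len(src_words)


def retrain(lexicon: Lexicon, pairs: Set[Pair]) -> Lexicon:
    """Toy stand-in for retraining (assumption): when a pair leaves exactly one
    source word and one target word unexplained, link them in the lexicon."""
    new_lex = {w: set(t) for w, t in lexicon.items()}  # copy, do not mutate input
    for src, tgt in pairs:
        src_words = src.lower().split()
        tgt_words = tgt.lower().split()
        tgt_set = set(tgt_words)
        unknown_src = [w for w in src_words if not (new_lex.get(w, set()) & tgt_set)]
        explained: Set[str] = set()
        for w in src_words:
            explained |= new_lex.get(w, set())
        unknown_tgt = [w for w in tgt_words if w not in explained]
        if len(unknown_src) == 1 and len(unknown_tgt) == 1:
            new_lex.setdefault(unknown_src[0], set()).add(unknown_tgt[0])
    return new_lex


def bootstrap(src_sents: List[str], tgt_sents: List[str], lexicon: Lexicon,
              threshold: float = 0.6, target_size: int = 1000,
              max_rounds: int = 10) -> Tuple[List[Pair], Lexicon]:
    """Repeat extraction and retraining until enough pairs are collected
    or a round yields nothing new (proxy for 'stops improving')."""
    found: Set[Pair] = set()
    for _ in range(max_rounds):
        selected = {(s, t) for s in src_sents for t in tgt_sents
                    if overlap_score(s, t, lexicon) >= threshold}
        if selected <= found:          # nothing new this round
            break
        found |= selected
        if len(found) >= target_size:  # sufficient data collected
            break
        lexicon = retrain(lexicon, found)
    return sorted(found), lexicon


# Hypothetical sample data: two small nonparallel collections and a seed lexicon.
SRC = ["the cat sleeps", "the dog eats", "a bird sleeps"]
TGT = ["le chat dort", "le chien mange", "un oiseau dort", "il pleut"]
SEED_LEXICON = {"the": {"le"}, "cat": {"chat"}, "dog": {"chien"},
                "eats": {"mange"}, "bird": {"oiseau"}}

if __name__ == "__main__":
    pairs, learned = bootstrap(SRC, TGT, SEED_LEXICON)
    for s, t in pairs:
        print(f"{s}  <->  {t}")
```

In this toy run the pair "a bird sleeps" / "un oiseau dort" scores below the threshold with the seed lexicon and is only accepted in the second round, after the first round's pairs have added a translation for "sleeps". That is the kind of gain the repeated application is meant to produce.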

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.