Patent · US Active

Methods for using manual phrase alignment data to generate translation models for statistical machine translation

US8229728B2 · kind B2 · utility

9 Cited by

2 References

30 Claims

0 Family size

Assignee

Fluential, LLC · US

Inventors

Jun-Yao Huang · Sanwan, TW
Yookyung Kim · Los Altos, US
Demitrios Master · Cupertino, US
Farzad Ehsani · Sunnyvale, US

Key dates

Filing date	Jan 4, 2008
Grant date	Jul 24, 2012
Priority date	—
Expiry date	Jun 2, 2030

Classification

Technology area (CPC G)	Physics
CPC primary	G06F40/45
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

The present invention adopts the fundamental architecture of a statistical machine translation system, which uses statistical models learned from training data and does not require the expert knowledge needed to build rule-based machine translation systems. From the parallel training data, a number of sentence pairs are selected for manual alignment. These sentences are aligned at the phrase level rather than at the word level. The optimal amount of data to align manually may vary with the size of the training data. The alignment is done using an alignment tool with a graphical user interface that is convenient and intuitive for users. The manually aligned data are then used to improve the automatic word alignment component. Model combination methods are also introduced to improve the accuracy and coverage of the statistical models used for statistical machine translation.
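
The abstract does not say how the manual phrase alignments feed into the automatic word aligner. As an illustration only, one plausible reading is that each manually linked phrase pair restricts which word-to-word links are acceptable. The Python sketch below assumes that reading. The span layout, the function names, and the filtering rule are all hypothetical and are not taken from the patent.

```python
from typing import Iterable, Set, Tuple

# Hypothetical data layout: a phrase link pairs a source token span with a
# target token span. Spans are half-open (start, end) index ranges.
Span = Tuple[int, int]
PhraseLink = Tuple[Span, Span]
WordLink = Tuple[int, int]  # (source token index, target token index)


def allowed_word_links(phrase_links: Iterable[PhraseLink]) -> Set[WordLink]:
    """Return every word link that falls inside some manually aligned phrase pair."""
    allowed: Set[WordLink] = set()
    for (s_start, s_end), (t_start, t_end) in phrase_links:
        for i in range(s_start, s_end):
            for j in range(t_start, t_end):
                allowed.add((i, j))
    return allowed


def filter_word_alignment(
    auto_links: Iterable[WordLink],
    phrase_links: Iterable[PhraseLink],
) -> Set[WordLink]:
    """Keep only automatic word links consistent with the manual phrase links.

    Assumption: a word link is consistent when both of its endpoints lie
    inside the same manually linked phrase pair.
    """
    allowed = allowed_word_links(phrase_links)
    return {link for link in auto_links if link in allowed}


if __name__ == "__main__":
    # Made-up example: source tokens 0-1 are linked to target token 1,
    # and source tokens 2-3 are linked to target token 0.
    manual = [((0, 2), (1, 2)), ((2, 4), (0, 1))]
    automatic = {(0, 1), (1, 1), (2, 0), (3, 1)}
    print(sorted(filter_word_alignment(automatic, manual)))
```

In this made-up example the automatic link (3, 1) crosses a manual phrase boundary and is dropped, while the other three links are kept. A real system could instead use the manual links as soft evidence, for example by reweighting alignment probabilities rather than filtering links outright.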

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.