Patent · US Active

Multi-domain machine translation model adaptation

US9235567B2 · kind B2 · utility

Cited by	3

References	3

Claims	21

Family size	0

Assignee

Xerox Corporation · US

Inventors

Markos Mylonakis · Grenoble, FR
Nicola Cancedda · Grenoble, FR

Key dates

Filing date	Jan 14, 2013
Grant date	Jan 12, 2016
Priority date	—
Expiry date	Jan 20, 2034

Classification

Technology area (CPC G)	Physics
CPC primary	G06F40/44
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method adapted to multiple corpora includes training a statistical machine translation model which outputs a score for a candidate translation, in a target language, of a text string in a source language. The training includes learning a weight for each of a set of lexical coverage features that are aggregated in the statistical machine translation model. The lexical coverage features include a lexical coverage feature for each of a plurality of parallel corpora. Each of the lexical coverage features represents a relative number of words of the text string for which the respective parallel corpus contributed a biphrase to the candidate translation. The method may also include learning a weight for each of a plurality of language model features, the language model features comprising one language model feature for each of the domains.
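The lexical coverage feature described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Biphrase` class, the function names, the corpus labels, and the weight values are all hypothetical, and the sketch assumes that each biphrase is attributed to exactly one parallel corpus and that "relative number of words" means the share of source words covered. In the method itself the weights would be learned during training; here they are fixed by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Biphrase:
    """A source/target phrase pair (hypothetical structure for illustration)."""
    source: tuple   # source-language words covered by this biphrase
    target: tuple   # target-language words it produces
    corpus: str     # label of the parallel corpus that contributed it (assumed unique)


def lexical_coverage(biphrases, corpora):
    """Return, per corpus, the fraction of source words covered by its biphrases."""
    total = sum(len(bp.source) for bp in biphrases)
    counts = {c: 0 for c in corpora}
    for bp in biphrases:
        counts[bp.corpus] += len(bp.source)
    if total == 0:
        return {c: 0.0 for c in corpora}
    return {c: counts[c] / total for c in corpora}


def weighted_score(features, weights):
    """Aggregate feature values as a weighted sum (log-linear style)."""
    return sum(weights[name] * value for name, value in features.items())


# A candidate translation built from three biphrases over a five-word source string.
derivation = [
    Biphrase(("the", "patent"), ("le", "brevet"), "legal"),
    Biphrase(("describes",), ("décrit",), "news"),
    Biphrase(("a", "method"), ("un", "procédé"), "legal"),
]

coverage = lexical_coverage(derivation, ["legal", "news"])

# Illustrative weights only; the abstract says these are learned, not set by hand.
weights = {"legal": 0.5, "news": -0.5}
score = weighted_score(coverage, weights)
```

In this example the "legal" corpus covers four of the five source words and the "news" corpus covers one, so a positive weight on "legal" favors candidates that draw more of their biphrases from that corpus. Per-domain language model features, which the abstract also mentions, would enter the same weighted sum as additional entries.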

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.