Patent · US Active

Multi-domain machine translation model adaptation

US9235567B2 · kind B2 · utility

Cited by: 3
References: 3
Claims: 21
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 14, 2013
Grant date: Jan 12, 2016
Priority date
Expiry date: Jan 20, 2034

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/44
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method adapted to multiple corpora includes training a statistical machine translation model which outputs a score for a candidate translation, in a target language, of a text string in a source language. The training includes learning a weight for each of a set of lexical coverage features that are aggregated in the statistical machine translation model. The lexical coverage features include a lexical coverage feature for each of a plurality of parallel corpora. Each of the lexical coverage features represents a relative number of words of the text string for which the respective parallel corpus contributed a biphrase to the candidate translation. The method may also include learning a weight for each of a plurality of language model features, the language model features comprising one language model feature for each of the domains.
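
The lexical coverage feature described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that each biphrase used in a candidate translation is tagged with exactly one contributing corpus, and that the model score is a plain weighted sum of feature values. The names Biphrase, lexical_coverage, and score, the corpus labels, and the weight values are all hypothetical; in the method itself the weights are learned during training.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Biphrase:
    """A source/target phrase pair plus the corpus that supplied it (assumed: exactly one)."""
    source_words: tuple[str, ...]
    target_words: tuple[str, ...]
    corpus: str


def lexical_coverage(biphrases, corpora, source_length):
    """Per corpus, the fraction of source words covered by biphrases from that corpus."""
    covered = {name: 0 for name in corpora}
    for bp in biphrases:
        covered[bp.corpus] += len(bp.source_words)
    return {name: covered[name] / source_length for name in corpora}


def score(features, weights):
    """Aggregate the features as a weighted sum (assumed linear combination)."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical candidate translation of a five-word source string.
source = ("the", "patient", "takes", "the", "drug")
candidate = [
    Biphrase(("the", "patient"), ("le", "patient"), "medical"),
    Biphrase(("takes",), ("prend",), "news"),
    Biphrase(("the", "drug"), ("le", "médicament"), "medical"),
]

coverage = lexical_coverage(candidate, ["medical", "news"], len(source))
weights = {"medical": 1.5, "news": 0.5}  # illustrative values, not learned
total = score(coverage, weights)
```

In this example the hypothetical "medical" corpus covers four of the five source words and the "news" corpus covers one, so the two coverage features sum to 1 when every source word is covered by exactly one biphrase. The per-domain language model features mentioned in the abstract would enter the same weighted sum as additional terms.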

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.