Patent · US Active

Systems and methods for identifying parallel documents and sentence fragments in multilingual document collections

US8943080B2 · kind B2 · utility

Cited by: 11
References: 228
Claims: 19
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 5, 2006
Grant date: Jan 27, 2015
Priority date
Expiry date: Jun 13, 2030

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/45
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems, computer programs, and methods for identifying parallel documents and/or fragments in a bilingual collection are provided. The method for identifying parallel sub-sentential fragments in a bilingual collection comprises translating a source document from a bilingual collection. The method further includes querying a target library associated with the bilingual collection using the translated source document, and identifying one or more target documents based on the query. Subsequently, a source sentence associated with the source document is aligned to one or more target sentences associated with the one or more target documents. Finally, the method includes determining whether a source fragment associated with the source sentence comprises a parallel translation of a target fragment associated with the one or more target sentences.
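
The steps named in the abstract (translate, query, identify target documents, align sentences, test fragments) can be illustrated with a toy sketch. This is not the patented method: the abstract does not say how any step is carried out, so everything below is an assumption made for illustration. The Spanish-to-English lexicon, the word-overlap scoring, the "longest contiguous run" fragment test, and all function names (translate, query_library, align_sentence, parallel_fragment, find_fragments) are hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon (source word -> target word); not from the patent.
LEXICON = {"el": "the", "gato": "cat", "negro": "black",
           "duerme": "sleeps", "perro": "dog", "come": "eats"}

def translate(words):
    """Word-by-word gloss; words missing from the lexicon are dropped."""
    return [LEXICON[w] for w in words if w in LEXICON]

def overlap(query_words, candidate_words):
    """Multiset overlap: how many query tokens also occur in the candidate."""
    return sum((Counter(query_words) & Counter(candidate_words)).values())

def query_library(translated_doc, library, top_k=1):
    """Return ids of the top_k target documents, ranked by word overlap."""
    scored = []
    for doc_id, sentences in library.items():
        doc_words = [w for s in sentences for w in s.split()]
        scored.append((overlap(translated_doc, doc_words), doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]

def align_sentence(source_sentence, target_sentences):
    """Pick the target sentence that shares the most glossed words."""
    gloss = translate(source_sentence.split())
    return max(target_sentences, key=lambda t: overlap(gloss, t.split()))

def parallel_fragment(source_sentence, target_sentence, min_len=2):
    """Longest contiguous run of source words whose gloss occurs in the
    target sentence; None if that run is shorter than min_len words."""
    target_words = set(target_sentence.split())
    best, run = [], []
    for w in source_sentence.split():
        if LEXICON.get(w) in target_words:
            run.append(w)
            if len(run) > len(best):
                best = list(run)
        else:
            run = []
    return " ".join(best) if len(best) >= min_len else None

def find_fragments(source_sentences, library):
    """Run the five steps in order for one source document."""
    translated = translate([w for s in source_sentences for w in s.split()])
    results = []
    for doc_id in query_library(translated, library):
        for src in source_sentences:
            tgt = align_sentence(src, library[doc_id])
            results.append((src, tgt, parallel_fragment(src, tgt)))
    return results

if __name__ == "__main__":
    # Made-up target library: document id -> list of sentences.
    library = {"t1": ["yesterday the black cat ran away"],
               "t2": ["the dog eats"]}
    print(find_fragments(["el gato negro duerme en casa"], library))
```

In this made-up example the two sentences are not translations of each other as a whole, yet the source fragment "el gato negro" is reported because each of its words has a gloss in the aligned target sentence. That is the sub-sentential case the abstract distinguishes from whole-sentence parallelism. A practical system would presumably replace the overlap counts with a proper retrieval engine and a statistical or learned alignment model.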

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.