Patent · US Expired

Method, an apparatus, a system, a storage device, and a computer readable medium using a bilingual database including aligned corpora

US5867811A · kind A · utility

Cited by: 129
References: 3
Claims: 41
Family size: 0

Assignees

Inventor

Key dates

Filing date: Feb 16, 1995
Grant date: Feb 2, 1999
Priority date
Expiry date: Feb 16, 2015

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/51
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Aligned corpora are generated or received from an external source. Each corpus comprises a set of portions aligned with corresponding portions of the other corpus, for example, so that aligned portions are nominally translations of one another in two languages. A statistical database is compiled. An evaluation module calculates correlation scores for pairs of words chosen one from each corpus. Given a pair of text portions (one in each language) the evaluation module combines word pair correlation scores to obtain an alignment score for the text portions. These alignment scores can be used to verify a translation and/or to modify the aligned corpora to remove improbable alignments.
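
The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation measure or combination rule is used, so everything specific here is an assumption made for illustration: the Dice coefficient as the word-pair correlation, averaging each source word's best match as the combination rule, whitespace tokenization, the toy corpus, and the function names build_stats, correlation, alignment_score, and prune. It should not be read as the patented method.

```python
from collections import Counter
from itertools import product


def tokens(text):
    """Assumed tokenizer: lowercase, split on whitespace, unique words only."""
    return set(text.lower().split())


def build_stats(aligned_pairs):
    """Compile the statistical database: word and word-pair counts over aligned portions."""
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src, tgt in aligned_pairs:
        src_words, tgt_words = tokens(src), tokens(tgt)
        src_counts.update(src_words)
        tgt_counts.update(tgt_words)
        pair_counts.update(product(src_words, tgt_words))
    return src_counts, tgt_counts, pair_counts


def correlation(s, t, stats):
    """Word-pair correlation score; the Dice coefficient is an assumed stand-in."""
    src_counts, tgt_counts, pair_counts = stats
    denom = src_counts[s] + tgt_counts[t]
    return 2 * pair_counts[(s, t)] / denom if denom else 0.0


def alignment_score(src, tgt, stats):
    """Combine word-pair scores: mean of each source word's best target match (assumed rule)."""
    src_words, tgt_words = tokens(src), tokens(tgt)
    if not src_words or not tgt_words:
        return 0.0
    best = [max(correlation(s, t, stats) for t in tgt_words) for s in src_words]
    return sum(best) / len(best)


def prune(aligned_pairs, stats, threshold):
    """Drop alignments scoring below a (hypothetical) threshold."""
    return [(s, t) for s, t in aligned_pairs
            if alignment_score(s, t, stats) >= threshold]


# Toy English/French corpus, invented for this example.
corpus = [
    ("the house", "la maison"),
    ("the cat", "le chat"),
    ("a house", "une maison"),
    ("a cat", "un chat"),
]
stats = build_stats(corpus)
good = alignment_score("the house", "la maison", stats)
bad = alignment_score("the house", "un chat", stats)
kept = prune(corpus + [("the house", "un chat")], stats, threshold=0.5)
```

On this toy data the plausible pair scores higher than the mismatched one, and prune removes the mismatched pair while keeping the four original alignments. That corresponds to the two uses named in the abstract: verifying a translation and removing improbable alignments.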

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.