Method, an apparatus, a system, a storage device, and a computer readable medium using a bilingual database including aligned corpora
US5867811A · kind A · utility
Assignees
Inventor
Key dates
| Filing date | Feb 16, 1995 |
| Grant date | Feb 2, 1999 |
| Priority date | — |
| Expiry date | Feb 16, 2015 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/51
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Aligned corpora are generated or received from an external source. Each corpus comprises a set of portions aligned with corresponding portions of the other corpus, for example, so that aligned portions are nominally translations of one another in two languages. A statistical database is compiled. An evaluation module calculates correlation scores for pairs of words chosen one from each corpus. Given a pair of text portions (one in each language), the evaluation module combines word pair correlation scores to obtain an alignment score for the text portions. These alignment scores can be used to verify a translation and/or to modify the aligned corpora to remove improbable alignments.
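The pipeline the abstract describes (compile statistics, score word pairs, combine the scores into an alignment score, then filter) can be illustrated with a minimal Python sketch. The abstract does not name the correlation measure or the combination rule, so everything specific here is an assumption for illustration: the Dice coefficient as the word-pair correlation, the average of each source word's best match as the combination, whitespace tokenization, and the hypothetical function names `build_stats`, `correlation`, `alignment_score`, and `filter_corpus`.

```python
from collections import Counter
from itertools import product


def build_stats(aligned_pairs):
    """Compile a statistical database: per-word and word-pair counts,
    counted once per aligned portion in which they occur."""
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src, tgt in aligned_pairs:
        src_words = set(src.lower().split())
        tgt_words = set(tgt.lower().split())
        src_counts.update(src_words)
        tgt_counts.update(tgt_words)
        pair_counts.update(product(src_words, tgt_words))
    return src_counts, tgt_counts, pair_counts


def correlation(s, t, stats):
    """Word-pair correlation score. ASSUMPTION: Dice coefficient;
    the patent abstract does not specify the measure."""
    src_counts, tgt_counts, pair_counts = stats
    denom = src_counts[s] + tgt_counts[t]
    return 2 * pair_counts[(s, t)] / denom if denom else 0.0


def alignment_score(src, tgt, stats):
    """Combine word-pair scores into one score for a pair of portions.
    ASSUMPTION: mean over source words of the best-matching target word."""
    src_words = src.lower().split()
    tgt_words = tgt.lower().split()
    if not src_words or not tgt_words:
        return 0.0
    best = (max(correlation(s, t, stats) for t in tgt_words) for s in src_words)
    return sum(best) / len(src_words)


def filter_corpus(aligned_pairs, stats, threshold):
    """Drop aligned portions whose score falls below a chosen threshold."""
    return [
        (s, t) for s, t in aligned_pairs
        if alignment_score(s, t, stats) >= threshold
    ]


# Toy aligned corpora (invented data, for illustration only).
corpus = [
    ("the house", "la maison"),
    ("the cat", "le chat"),
    ("a house", "une maison"),
]
stats = build_stats(corpus)
good = alignment_score("house", "maison", stats)
bad = alignment_score("cat", "maison", stats)
```

In this toy corpus, "house" and "maison" always co-occur, so a candidate pairing them scores higher than one pairing "cat" with "maison". The same score can serve either use mentioned in the abstract: checking a proposed translation, or pruning improbable alignments from the corpora with `filter_corpus`.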
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.