Identifying language translations for source documents using links
US8271869B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Oct 8, 2010 |
| Grant date | Sep 18, 2012 |
| Priority date | — |
| Expiry date | Mar 1, 2031 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/263
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Technology is described for identifying language translations for source documents. The method includes finding source documents that contain links to target documents, where the link anchors of those links contain language-indicating text. A first tuple set can be generated that pairs source documents with target documents and records an expected target language for each target document. The first tuple set can be annotated with the primary languages of the source documents and target documents to form a second tuple set, in which the primary languages of the source document and the target document differ. Further, a third tuple set can be generated from the second tuple set using counts of the number of times source documents and target documents occur in the first tuple set. Tuples can be removed from the third tuple set where the ratio between the source document count and the target document count is less than a reference ratio.
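The tuple-set steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The anchor-text table `ANCHOR_LANGUAGES`, the `Link` type, the function names, and the `primary_language` lookup (standing in for a language detector) are all hypothetical. The abstract does not state which way the count ratio is taken, so the sketch assumes source count divided by target count. Under that assumption, a target that many different sources link to (for example a language home page) gets a low ratio and is dropped.

```python
from collections import Counter
from typing import NamedTuple

# Hypothetical table mapping language-indicating anchor text to a language code.
ANCHOR_LANGUAGES = {"español": "es", "deutsch": "de", "français": "fr", "english": "en"}


class Link(NamedTuple):
    source: str  # URL of the document containing the link
    target: str  # URL the link points to
    anchor: str  # visible anchor text of the link


def first_tuples(links):
    """Pair source and target documents with the language the anchor text suggests."""
    tuples = []
    for link in links:
        expected = ANCHOR_LANGUAGES.get(link.anchor.strip().lower())
        if expected is not None:
            tuples.append((link.source, link.target, expected))
    return tuples


def second_tuples(first, primary_language):
    """Annotate with primary languages; keep pairs whose languages differ."""
    tuples = []
    for source, target, expected in first:
        src_lang = primary_language.get(source)
        tgt_lang = primary_language.get(target)
        if src_lang is None or tgt_lang is None:
            continue  # assumption: skip documents with no detected language
        if src_lang != tgt_lang:
            tuples.append((source, target, expected, src_lang, tgt_lang))
    return tuples


def third_tuples(first, second):
    """Attach how often each source and target occurs in the first tuple set."""
    source_counts = Counter(source for source, _, _ in first)
    target_counts = Counter(target for _, target, _ in first)
    return [row + (source_counts[row[0]], target_counts[row[1]]) for row in second]


def filter_by_ratio(third, reference_ratio):
    """Drop tuples whose source/target count ratio is below the reference ratio."""
    # Assumed direction of the ratio: source count divided by target count.
    return [row for row in third if row[5] / row[6] >= reference_ratio]


if __name__ == "__main__":
    links = [
        Link("a.html", "a_es.html", "Español"),
        Link("b.html", "home_es.html", "Español"),
        Link("c.html", "home_es.html", "Español"),
        Link("a.html", "about.html", "About"),
    ]
    languages = {"a.html": "en", "b.html": "en", "c.html": "en",
                 "a_es.html": "es", "home_es.html": "es", "about.html": "en"}
    first = first_tuples(links)
    third = third_tuples(first, second_tuples(first, languages))
    for row in filter_by_ratio(third, reference_ratio=0.75):
        print(row)
```

The reference ratio of 0.75 in the usage example is an arbitrary illustrative value; the abstract does not specify one.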
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.