Transliteration pair matching
US9176936B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 28, 2012 |
| Grant date | Nov 3, 2015 |
| Priority date | — |
| Expiry date | Aug 25, 2033 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06F40/232
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Feature sequences are extracted, as individual letters separated by spaces, from a digital representation of a proper name in a first language to obtain a first orthographic feature sequence set; and from a digital representation of a proper name in a second language to obtain a second orthographic feature sequence set. The first and second orthographic feature sequence sets (a transliteration pair) are compared to determine a similarity score, based on a similarity model including a plurality of conditional probabilities of known orthographic feature sequences in the first language given known orthographic feature sequences in the second language and a plurality of conditional probabilities of known orthographic feature sequences in the second language given known orthographic feature sequences in the first language. Based on at least one threshold value, it is determined whether the transliteration pair belong to an identical actual proper name.
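The matching flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes single-letter features only, tiny hand-written probability tables, a geometric-mean scoring rule, and a threshold of 0.5. The function names (`extract_features`, `directional_score`, `similarity`, `is_same_name`) and the tables `P_LATIN_GIVEN_CYRILLIC` and `P_CYRILLIC_GIVEN_LATIN` are all hypothetical and chosen only for illustration.

```python
from math import exp, log

# Hypothetical toy similarity model: P(feature in one script | feature in the other).
# A real model would be estimated from known transliteration pairs.
P_LATIN_GIVEN_CYRILLIC = {("i", "и"): 0.9, ("v", "в"): 0.9, ("a", "а"): 0.9, ("n", "н"): 0.9}
P_CYRILLIC_GIVEN_LATIN = {("и", "i"): 0.9, ("в", "v"): 0.9, ("а", "a"): 0.9, ("н", "n"): 0.9}


def extract_features(name: str) -> str:
    """Return the name as individual lowercase letters separated by spaces."""
    return " ".join(ch for ch in name.lower() if not ch.isspace())


def directional_score(src: str, tgt: str, cond_prob: dict, floor: float = 1e-6) -> float:
    """Average log of the best P(src feature | tgt feature) for each src feature.

    Unknown pairs fall back to a small floor probability (an assumption).
    """
    src_feats, tgt_feats = src.split(), tgt.split()
    if not src_feats:
        return log(floor)
    total = 0.0
    for s in src_feats:
        best = max((cond_prob.get((s, t), floor) for t in tgt_feats), default=floor)
        total += log(best)
    return total / len(src_feats)


def similarity(name_a: str, name_b: str, p_a_given_b: dict, p_b_given_a: dict) -> float:
    """Combine both conditional directions into one score in (0, 1]."""
    a, b = extract_features(name_a), extract_features(name_b)
    forward = directional_score(a, b, p_a_given_b)
    backward = directional_score(b, a, p_b_given_a)
    return exp((forward + backward) / 2)


def is_same_name(name_a: str, name_b: str, threshold: float = 0.5) -> bool:
    """Decide whether the pair is treated as the same proper name."""
    score = similarity(name_a, name_b, P_LATIN_GIVEN_CYRILLIC, P_CYRILLIC_GIVEN_LATIN)
    return score >= threshold


if __name__ == "__main__":
    print(extract_features("Ivan"))
    print(is_same_name("Ivan", "Иван"))
    print(is_same_name("Ivan", "Олег"))
```

Scoring in both directions mirrors the two sets of conditional probabilities named in the abstract. How the real model combines them, and which thresholds it applies, is not stated here, so the averaging rule above is only one plausible choice.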
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.