Hybrid comparison for unicode text strings consisting primarily of ASCII characters
US10540425B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jun 18, 2019 |
| Grant date | Jan 21, 2020 |
| Priority date | — |
| Expiry date | Jun 18, 2039 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H03M7/705
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method compares text strings having Unicode encoding. The method receives a first string S = s1 s2 ... sn and a second string T = t1 t2 ... tm, where s1, s2, ..., sn and t1, t2, ..., tm are Unicode characters. The method computes a first string weight for the first string S according to a weight function ƒ. When S consists of ASCII characters, ƒ(S) = S. When S consists of ASCII characters and some accented ASCII characters that are replaceable by ASCII characters, ƒ(S) = g(s1) g(s2) ... g(sn), where g(si) = si when si is an ASCII character and g(si) = si′ when si is an accented ASCII character that is replaceable by the corresponding ASCII character si′. When S includes one or more non-replaceable non-ASCII characters, the first string weight concatenates an ASCII weight prefix ƒA(S) and a Unicode weight suffix ƒU(S). The method also computes a second string weight for the second text string T. Equality of the strings is tested using the string weights.
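The three cases of the weight function can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not define ƒA or ƒU, so the sketch makes labeled assumptions. A character is treated as "replaceable" when its canonical decomposition is an ASCII base character plus combining marks. The ASCII weight prefix is assumed to be the g-mapped run of characters before the first non-replaceable character, and the Unicode weight suffix is assumed to be the remaining characters unchanged. The names `g`, `string_weight`, and `strings_equal` are hypothetical.

```python
import unicodedata


def g(ch):
    """Return an ASCII replacement for ch, or None if there is none.

    Assumption: ch is "replaceable" when its canonical decomposition is
    an ASCII base character followed only by combining marks.
    """
    if ch.isascii():
        return ch
    decomposed = unicodedata.normalize("NFD", ch)
    base, marks = decomposed[0], decomposed[1:]
    if base.isascii() and marks and all(unicodedata.combining(m) for m in marks):
        return base
    return None


def string_weight(s):
    """Compute a comparison weight for s following the three cases."""
    # Compose first so "e" + combining acute is seen as one accented character.
    s = unicodedata.normalize("NFC", s)
    mapped = [g(ch) for ch in s]
    if None not in mapped:
        # Cases 1 and 2: pure ASCII, or ASCII plus replaceable accents.
        return "".join(mapped)
    # Case 3 (hybrid): assumed prefix/suffix split at the first
    # non-replaceable character; the real f_A and f_U may differ.
    split = mapped.index(None)
    ascii_prefix = "".join(mapped[:split])
    unicode_suffix = s[split:]
    return ascii_prefix + unicode_suffix


def strings_equal(s, t):
    """Test equality of two strings by comparing their weights."""
    return string_weight(s) == string_weight(t)
```

Under these assumptions, `strings_equal("café", "cafe")` holds because both weights reduce to the same ASCII string, while a string containing characters such as "日本" takes the hybrid path. A production system would more likely derive the suffix from a collation library than keep raw characters.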
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.