Hybrid approach to collating unicode text strings consisting primarily of ASCII characters
US10325010B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 18, 2018 |
| Grant date | Jun 18, 2019 |
| Priority date | — |
| Expiry date | Sep 18, 2038 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H03M7/705
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Collating text strings having Unicode encoding includes receiving two text strings $S = s_1 s_2 \ldots s_n$ and $T = t_1 t_2 \ldots t_m$. When the two text strings are not identical, there is a smallest positive integer $p$ for which the two text strings differ. The process looks up the characters $s_p$ and $t_p$ in a predefined lookup table. If either of these characters is missing from the lookup table, the collation of the text strings is determined using the standard Unicode comparison of the text strings $s_p s_{p+1} \ldots s_n$ and $t_p t_{p+1} \ldots t_m$. Otherwise, the lookup table assigns weights $v_p$ and $w_p$ for the characters $s_p$ and $t_p$. When $v_p \neq w_p$, these weights define the collation order of the strings $S$ and $T$. When $v_p = w_p$, the collation of $S$ and $T$ is determined recursively using the suffix strings $s_{p+1} \ldots s_n$ and $t_{p+1} \ldots t_m$.
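The comparison described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the weight table `WEIGHTS`, the function names `hybrid_compare` and `codepoint_compare`, and the handling of the prefix and equal-weight end cases are all hypothetical. The abstract does not specify the table contents, so the table here simply gives upper- and lowercase ASCII letters the same weight. The "standard Unicode comparison" is passed in as a `fallback` callable; the default compares by code point only as a stand-in, and a real system would more likely delegate to a Unicode Collation Algorithm library.

```python
import string
from typing import Callable, Dict

# Hypothetical weight table (an assumption, not taken from the patent):
# space < digits < letters, with each uppercase letter sharing the
# weight of its lowercase counterpart.
_ORDER = " " + string.digits + string.ascii_lowercase
WEIGHTS: Dict[str, int] = {ch: i for i, ch in enumerate(_ORDER)}
WEIGHTS.update({ch.upper(): w for ch, w in list(WEIGHTS.items()) if ch.isalpha()})


def codepoint_compare(a: str, b: str) -> int:
    """Stand-in for the standard Unicode comparison (code point order only)."""
    return (a > b) - (a < b)


def hybrid_compare(
    s: str,
    t: str,
    weights: Dict[str, int] = WEIGHTS,
    fallback: Callable[[str, str], int] = codepoint_compare,
) -> int:
    """Return -1, 0, or 1 for s < t, s == t, s > t under the hybrid scheme."""
    p = 0
    limit = min(len(s), len(t))
    while True:
        # Advance to the first position at which the strings differ.
        while p < limit and s[p] == t[p]:
            p += 1
        if p == limit:
            # Assumption: when one string is a prefix of the other,
            # the shorter string sorts first; equal lengths compare equal.
            return (len(s) > len(t)) - (len(s) < len(t))
        v = weights.get(s[p])
        w = weights.get(t[p])
        if v is None or w is None:
            # A character is missing from the table: hand the suffixes
            # starting at the differing position to the fallback.
            return fallback(s[p:], t[p:])
        if v != w:
            # Distinct weights decide the order.
            return -1 if v < w else 1
        # Equal weights: continue with the suffixes after position p
        # (iterative form of the recursion in the abstract).
        p += 1
```

To sort with this comparator, it can be wrapped with `functools.cmp_to_key`, for example `sorted(words, key=functools.cmp_to_key(hybrid_compare))`. Passing the fallback as a parameter keeps the fast table-driven path separate from whichever full Unicode collation routine is available.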
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.