Hybrid approach to collating unicode text strings consisting primarily of ASCII characters
US10089282B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Jan 31, 2018 |
| Grant date | Oct 2, 2018 |
| Priority date | — |
| Expiry date | Jan 31, 2038 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H03M7/705
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Collating text strings having Unicode encoding includes receiving two text strings $S = s_1 s_2 \ldots s_n$ and $T = t_1 t_2 \ldots t_m$. When the two text strings are not identical, there is a smallest positive integer $p$ at which the two text strings differ. The process looks up the characters $s_p$ and $t_p$ in a predefined lookup table. If either of these characters is missing from the lookup table, the collation of the text strings is determined using the standard Unicode comparison of the text strings $s_p s_{p+1} \ldots s_n$ and $t_p t_{p+1} \ldots t_m$. Otherwise, the lookup table assigns weights $v_p$ and $w_p$ to the characters $s_p$ and $t_p$. When $v_p \neq w_p$, these weights define the collation order of the strings $S$ and $T$. When $v_p = w_p$, the collation of $S$ and $T$ is determined recursively using the suffix strings $s_{p+1} \ldots s_n$ and $t_{p+1} \ldots t_m$.
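The procedure in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several parts are assumptions. The `WEIGHTS` table is hypothetical: it gives case-insensitive weights to ASCII digits and letters, because the abstract does not list the table's contents. `fallback_compare` is only a stand-in for the "standard Unicode comparison": it compares NFD-normalized code points and is not the Unicode Collation Algorithm. The handling of a string that is a prefix of the other (shorter sorts first) is also an assumption. The recursion on suffix strings is written as a loop.

```python
import unicodedata
from functools import cmp_to_key

# Hypothetical lookup table (assumption): digits, then letters, with
# upper- and lowercase letters sharing the same weight.
WEIGHTS = {}
for i, ch in enumerate("0123456789abcdefghijklmnopqrstuvwxyz"):
    WEIGHTS[ch] = i
    WEIGHTS[ch.upper()] = i


def fallback_compare(s, t):
    """Placeholder for the standard Unicode comparison (assumption).

    Compares NFD-normalized code points; a real system would call a
    full Unicode collation routine here instead.
    """
    a = unicodedata.normalize("NFD", s)
    b = unicodedata.normalize("NFD", t)
    return (a > b) - (a < b)


def hybrid_compare(s, t, weights=WEIGHTS, fallback=fallback_compare):
    """Return -1, 0, or 1 according to the hybrid collation order."""
    start = 0
    limit = min(len(s), len(t))
    while True:
        # Find the first position p >= start where the strings differ.
        p = start
        while p < limit and s[p] == t[p]:
            p += 1
        if p >= limit:
            # No differing character pair remains. Assumption: the
            # shorter string sorts first; equal lengths compare equal.
            return (len(s) > len(t)) - (len(s) < len(t))
        v = weights.get(s[p])
        w = weights.get(t[p])
        if v is None or w is None:
            # A character is missing from the table: hand the suffixes
            # starting at p to the fallback comparison.
            return fallback(s[p:], t[p:])
        if v != w:
            # Distinct weights decide the order immediately.
            return -1 if v < w else 1
        # Equal weights: continue with the suffixes after position p.
        start = p + 1


words = ["banana", "Apple", "cherry"]
print(sorted(words, key=cmp_to_key(hybrid_compare)))
# ['Apple', 'banana', 'cherry']
```

For strings that are mostly ASCII, the loop typically resolves the order with one or two dictionary lookups and calls the more expensive fallback only when a character outside the table appears at the first differing position. In this sketch, strings whose characters differ only by equal-weight pairs (such as "Apple" and "apple") compare as equal; a real collation would likely apply a further tie-break.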
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.