Patent · US Active

Hybrid comparison for unicode text strings consisting primarily of ASCII characters

US10089281B1 · kind B1 · utility

Cited by: 14
References: 3
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Sep 28, 2017
Grant date: Oct 2, 2018
Priority date
Expiry date: Sep 28, 2037

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H03M7/705
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Comparing text strings with Unicode encoding includes receiving two text strings S1 and S2. The process computes, for the first text string S1, a first weight according to a weight function ƒ that computes an ASCII prefix ƒA(S1), computes a Unicode weight suffix ƒU(S1), and concatenates the two to form the first weight ƒ(S1)=ƒA(S1)+ƒU(S1). Computing the ASCII prefix for the first string applies bitwise operations to n-byte contiguous blocks of the first string to determine whether each block contains only ASCII characters, and replaces accented Unicode characters with equivalent unaccented ASCII characters when the comparison is designated as accent-insensitive. When there is a first block containing a non-replaceable non-ASCII character, the Unicode weight suffix is computed by performing a character-by-character Unicode weight lookup beginning with that first block. The same process is applied to the second string S2. The text strings are compared by comparing their computed weights.
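
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that do not come from the patent text: the strings are UTF-8 encoded, the block size n is 8 bytes, the ASCII test is a single AND against a high-bit mask, accents are stripped through NFD decomposition, and the per-character "weight lookup" is a placeholder that returns the code point rather than consulting a real collation table. The names `weight`, `compare`, `is_ascii_block`, `strip_accents`, and `unicode_weight` are hypothetical.

```python
import unicodedata

BLOCK = 8                          # assumed block size n, in bytes
HIGH_BITS = 0x8080808080808080     # top bit of each of the 8 bytes


def is_ascii_block(block: bytes) -> bool:
    """Test a whole block with one bitwise AND (ASCII bytes have bit 7 clear)."""
    word = int.from_bytes(block.ljust(BLOCK, b"\0"), "little")
    return (word & HIGH_BITS) == 0


def strip_accents(text: str) -> str:
    """Assumed accent replacement: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def unicode_weight(ch: str) -> int:
    """Placeholder for a collation-table lookup (assumption: code point)."""
    return ord(ch)


def weight(s: str, accent_insensitive: bool = False) -> tuple:
    """Return f(S) = fA(S) + fU(S) as one tuple of integer weights."""
    data = s.encode("utf-8")
    prefix = bytearray()
    pos = 0
    folded = not accent_insensitive  # True once no further replacement applies
    while pos < len(data):
        block = data[pos:pos + BLOCK]
        if is_ascii_block(block):
            prefix += block          # fast path: bytes serve as weights
            pos += BLOCK
        elif not folded:
            # Replace accented characters from this block onward, then retest.
            tail = strip_accents(data[pos:].decode("utf-8"))
            data = data[:pos] + tail.encode("utf-8")
            folded = True
        else:
            break                    # non-replaceable non-ASCII character
    # Everything before pos is ASCII, so pos falls on a character boundary.
    suffix = tuple(unicode_weight(ch) for ch in data[pos:].decode("utf-8"))
    return tuple(prefix) + suffix


def compare(s1: str, s2: str, accent_insensitive: bool = False) -> int:
    """Return -1, 0, or 1 by comparing the concatenated weights."""
    w1 = weight(s1, accent_insensitive)
    w2 = weight(s2, accent_insensitive)
    return (w1 > w2) - (w1 < w2)
```

In this sketch the prefix and suffix are joined into a single tuple before comparison, mirroring the concatenation ƒ(S)=ƒA(S)+ƒU(S); comparing them as separate fields could order strings differently when one ASCII prefix is shorter than the other. Because the placeholder weights for ASCII characters equal their byte values, the two parts stay comparable; a real collation table would need the same property.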

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.