Patent · US Active

System and method for linguistic collation

US7941311B2 · kind B2 · utility

Cited by: 3
References: 20
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 22, 2003
Grant date: May 10, 2011
Priority date
Expiry date: Oct 16, 2026

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/12
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method are provided for handling the collation of linguistic symbols of different languages that may have various types of compressions (e.g., from 2-to-1 to 8-to-1). A symbol table of the symbols, identified as Unicode code points, is generated, with each symbol tagged with the highest compression type of that symbol by sorting the compression tables of the various languages. During a sorting operation on a given string, the tag of a symbol in the string is checked to identify the highest compression type among the compressions beginning with that symbol. The compression tables for the language with compression types equal to or lower than that highest compression type are then searched using a binary search method to find a matching compression for the symbols in the string. A common search module performs the binary searches through compression tables of the different compression types.
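
The lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The table layout, the function names (`build_symbol_tags`, `search_table`, `collation_elements`), the sample Spanish and Hungarian entries, and all weight values are assumptions made for illustration. The sketch also assumes that longer compressions are tried first and that an unmatched symbol falls back to its code point value, neither of which the abstract specifies.

```python
from bisect import bisect_left

# Hypothetical compression tables: language -> {compression length: sorted
# list of (sequence, weight)}. Length 2 means a 2-to-1 compression, and so on.
# The weights are made-up placeholders, not real collation weights.
TABLES = {
    "es": {2: [("ch", 1001), ("ll", 1002)]},
    "hu": {2: [("cs", 2001), ("dz", 2002), ("gy", 2003)],
           3: [("dzs", 2004)]},
}


def build_symbol_tags(tables_by_language):
    """Tag each leading symbol with the highest compression type that
    begins with it, taken across all languages."""
    tags = {}
    for tables in tables_by_language.values():
        for length, entries in tables.items():
            for seq, _weight in entries:
                tags[seq[0]] = max(tags.get(seq[0], 0), length)
    return tags


def search_table(entries, candidate):
    """Common binary search shared by every compression type.
    Returns the weight of `candidate`, or None if it is not in the table."""
    # (candidate,) sorts immediately before (candidate, weight), so
    # bisect_left lands on the matching entry when one exists.
    i = bisect_left(entries, (candidate,))
    if i < len(entries) and entries[i][0] == candidate:
        return entries[i][1]
    return None


def collation_elements(text, language, tables_by_language, tags):
    """Split `text` into (sequence, weight) pairs for one language."""
    tables = tables_by_language.get(language, {})
    out = []
    i = 0
    while i < len(text):
        highest = tags.get(text[i], 0)  # 0: no compression starts here
        match = None
        # Only tables at or below the symbol's tag are searched
        # (longest first is an assumption of this sketch).
        for length in range(highest, 1, -1):
            candidate = text[i:i + length]
            if len(candidate) < length or length not in tables:
                continue
            weight = search_table(tables[length], candidate)
            if weight is not None:
                match = (candidate, weight)
                break
        if match is None:
            # Fallback assumed here: a single symbol weighted by code point.
            match = (text[i], ord(text[i]))
        out.append(match)
        i += len(match[0])
    return out


tags = build_symbol_tags(TABLES)
print(collation_elements("dzsa", "hu", TABLES, tags))
print(collation_elements("chico", "es", TABLES, tags))
```

Because the tag is stored per symbol, a symbol that begins no compression skips the table searches entirely, and one `search_table` routine serves every compression length.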

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.