Patent · US Active

Simhash based spell correction

US8661341B1 · kind B1 · utility

Cited by: 3
References: 13
Claims: 16
Family size: 0

Assignee

Inventor

Key dates

Filing date: Jan 19, 2011
Grant date: Feb 25, 2014
Priority date
Expiry date: Sep 11, 2031

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F40/232
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Methods, systems, and apparatus for performing simhash based spell correction are provided. A character string is simhashed to generate a simhashed character string. A plurality of substrings is extracted from the character string by applying a sliding window of at least two characters to the character string. The plurality of substrings is hashed to produce a plurality of corresponding hash values. Each hash value is processed to generate the simhashed character string. The simhashed character string is then compared with character strings within a simhashed dictionary dataset to determine at least one candidate to replace the character string. Processing each hash value includes extracting a set of lowest bits from each hash value, and mapping each set of lowest bits to a bitmask.
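
The steps in the abstract can be illustrated with a minimal Python sketch. This is one possible reading and not the claimed implementation. Several details are assumptions the abstract does not state: the window width of 2, MD5 as the substring hash, the use of 6 low bits to select one of 64 bitmask positions, and Hamming distance as the comparison. All function names are hypothetical.

```python
import hashlib

WINDOW = 2      # sliding-window width; the abstract only says "at least two"
LOW_BITS = 6    # assumption: 6 low bits pick one of 64 bitmask positions


def substrings(word, window=WINDOW):
    """Return every window-length substring; a shorter word yields itself."""
    if len(word) <= window:
        return [word]
    return [word[i:i + window] for i in range(len(word) - window + 1)]


def stable_hash(text):
    """Hash a substring to a 64-bit integer (MD5 is an arbitrary stand-in)."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def simhash(word):
    """Set one bitmask bit per substring, chosen by the hash's lowest bits."""
    mask = 0
    for sub in substrings(word):
        low = stable_hash(sub) & ((1 << LOW_BITS) - 1)  # extract lowest bits
        mask |= 1 << low                                # map them to the bitmask
    return mask


def hamming(a, b):
    """Count the bit positions in which two bitmasks differ."""
    return bin(a ^ b).count("1")


def candidates(word, simhashed_dictionary, max_distance=4):
    """Return dictionary words whose bitmask is close to that of `word`.

    The distance threshold is an assumed tuning parameter.
    """
    target = simhash(word)
    scored = [(hamming(target, h), w) for w, h in simhashed_dictionary.items()]
    return [w for d, w in sorted(scored) if d <= max_distance]


# Precompute the simhashed dictionary once, then look up a misspelling.
dictionary = {w: simhash(w) for w in ["hello", "help", "world"]}
suggestions = candidates("helo", dictionary)
```

In this sketch, "helo" and "hello" share the substrings "he", "el", and "lo", so their bitmasks differ in at most the one bit contributed by "ll", and "hello" falls within the threshold. Words that share few substrings will usually land farther apart, although bit collisions can produce false candidates, so a real system would likely re-rank the results with an exact edit-distance check.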

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.