Patent · US Active

Locality-sensitive hashing to clean and normalize text logs

US11244156B1 · kind B1 · utility

Cited by: 0
References: 5
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 29, 2020
Grant date: Feb 8, 2022
Priority date
Expiry date: Oct 29, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F2218/12
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Techniques for improved text normalization are provided. Signatures are generated for a first word and a second word using a locality-sensitive hashing technique. A graph is constructed based on the first and second signatures, by creating a first node in the graph for the first word, creating a second node in the graph for the second word, and creating an edge in the graph connecting the first and second nodes upon determining that the first and second signatures match. A mapping from the first word to the second word is then generated based on the graph.
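
The pipeline in the abstract (signatures, graph, mapping) can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything specific in it is an assumption made for illustration: MinHash over lowercased character bigrams as the locality-sensitive hash, "signatures match" interpreted as agreeing on at least one band of the signature, and the most frequent word in each connected component chosen as the normalization target. The function names (char_ngrams, minhash_signature, build_graph, build_mapping) and the parameters num_hashes and bands are hypothetical.

```python
import hashlib
from collections import Counter, defaultdict
from itertools import combinations


def char_ngrams(word, n=2):
    """Character n-grams of the lowercased, boundary-padded word (assumed features)."""
    padded = f"^{word.lower()}$"
    grams = {padded[i:i + n] for i in range(len(padded) - n + 1)}
    return grams or {padded}  # fall back to the whole token if it is shorter than n


def minhash_signature(word, num_hashes=16):
    """MinHash signature: for each salted hash, keep the minimum digest over the n-grams."""
    grams = char_ngrams(word)
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # one salt per hash function
        signature.append(min(
            hashlib.blake2b(g.encode("utf-8"), digest_size=8, salt=salt).digest()
            for g in grams
        ))
    return tuple(signature)


def build_graph(vocabulary, num_hashes=16, bands=8):
    """One node per word; an edge when two signatures agree on at least one band."""
    rows = num_hashes // bands
    graph = {word: set() for word in vocabulary}
    buckets = defaultdict(list)
    for word in vocabulary:
        sig = minhash_signature(word, num_hashes)
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].append(word)
    for members in buckets.values():
        for u, v in combinations(members, 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def build_mapping(tokens, **lsh_params):
    """Map every word to the most frequent word in its connected component."""
    counts = Counter(tokens)
    graph = build_graph(sorted(counts), **lsh_params)
    mapping, seen = {}, set()
    for start in graph:
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        component, stack = [], [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        # Assumed rule: highest count wins, ties broken alphabetically.
        canonical = min(component, key=lambda w: (-counts[w], w))
        for word in component:
            mapping[word] = canonical
    return mapping


# Case variants share the same n-grams, hence the same signature and component.
tokens = ["error", "error", "Error", "ERROR", "timeout"]
mapping = build_mapping(tokens)
```

Because the features are lowercased, case variants such as "Error" and "ERROR" always receive the same signature and are mapped to the most frequent form. Whether a near-miss spelling (for example a typo) joins the same component depends on the hash values and on the choice of num_hashes and bands: fewer rows per band make matches more likely, at the cost of more false merges.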

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.