Patent · US Active

Scalable approach to information-theoretic string similarity using a guaranteed rank threshold

US10482128B2 · kind B2 · utility

Cited by: 0
References: 15
Claims: 18
Family size: 0

Assignee

Inventor

Key dates

Filing date: May 15, 2017
Grant date: Nov 19, 2019
Priority date
Expiry date: Jan 31, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06F16/951
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A string analysis tool for calculating a similarity metric between an input string and a plurality of strings in a collection to be searched. The string analysis tool may include optimizations that may reduce the number of calculations to be carried out when calculating the similarity metric for large volumes of data. In this regard, the string analysis tool may represent strings as features. As such, analysis may be performed relative to features (e.g., of either the input string or plurality of strings to be searched) such that features from the strings may be eliminated from consideration when identifying candidate strings from the collection for which a similarity metric is to be calculated. The elimination of features may be based on a minimum similarity metric threshold, wherein features that are incapable of contributing to a similarity metric above the minimum similarity metric threshold are eliminated from consideration.
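The threshold-based elimination described in the abstract can be illustrated with a small sketch. This is not the patented method; it is one plausible reading under stated assumptions: features are padded character bigrams, each feature is weighted by a smoothed self-information value, and similarity is a Lin-style ratio of shared to total feature weight. All names (`features`, `StringIndex`, `probe_features`) are hypothetical. Under this measure, a string that shares only features with total weight C can score at most 2C / (Wq + C), where Wq is the query's total weight, so the lightest query features can be skipped during candidate lookup as long as their combined weight stays below min_sim * Wq / (2 - min_sim).

```python
import math
from collections import defaultdict


def features(text, n=2):
    """Split a string into padded character n-grams (an assumed feature type)."""
    padded = f"^{text.lower()}$"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


class StringIndex:
    """Hypothetical inverted index from features to the strings containing them."""

    def __init__(self, strings):
        self.strings = list(strings)
        self.feature_sets = [features(s) for s in self.strings]
        self.postings = defaultdict(set)
        for string_id, feats in enumerate(self.feature_sets):
            for feat in feats:
                self.postings[feat].add(string_id)

    def info(self, feat):
        # Smoothed self-information: rare features weigh more, and the
        # value is always positive, even for features unseen in the index.
        df = len(self.postings.get(feat, ()))
        return math.log((len(self.strings) + 2) / (df + 1))

    def similarity(self, a, b):
        # Lin-style measure (assumed): shared weight over total weight.
        common = math.fsum(self.info(f) for f in a & b)
        total = math.fsum(self.info(f) for f in a) + math.fsum(self.info(f) for f in b)
        return 2 * common / total

    def probe_features(self, query_feats, min_sim):
        # Skip the lightest features while their combined weight stays
        # strictly below the budget; a string sharing only skipped
        # features cannot reach min_sim.
        query_weight = math.fsum(self.info(f) for f in query_feats)
        budget = min_sim * query_weight / (2 - min_sim)
        skipped, kept = 0.0, []
        for feat in sorted(query_feats, key=self.info):
            weight = self.info(feat)
            if skipped + weight < budget:
                skipped += weight
            else:
                kept.append(feat)
        return kept

    def search(self, query, min_sim):
        if not 0 < min_sim <= 1:
            raise ValueError("min_sim must be in (0, 1]")
        query_feats = features(query)
        candidates = set()
        for feat in self.probe_features(query_feats, min_sim):
            candidates |= self.postings.get(feat, set())
        scored = [
            (self.similarity(query_feats, self.feature_sets[i]), self.strings[i])
            for i in candidates
        ]
        return sorted((pair for pair in scored if pair[0] >= min_sim), reverse=True)
```

Calling `StringIndex(names).search("acme corp", 0.8)` scores only the strings that share at least one of the retained features, so frequent, low-information features never trigger a posting-list lookup. A higher `min_sim` allows more features to be skipped.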

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.