Patent · US Active

De-duplicating transaction records using targeted fuzzy matching

US12287767B2 · kind B2 · utility

Cited by: 0
References: 4
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 30, 2024
Grant date: Apr 29, 2025
Priority date
Expiry date: Jan 30, 2044

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/412
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A computer-implemented method is disclosed. The method includes obtaining, by a de-duplication server, a candidate pair of a plurality of digitally stored documents from a document database. Text elements are identified from each digitally stored document in the candidate pair in response, and the text elements are stored as document extraction attributes. The method then automatically computes and stores relative positional differences of the text elements between each digitally stored document of the candidate pair and a document similarity score based on the relative positional differences. The relative positional differences are compared with a similarity function to form a difference similarity vector for the candidate pair. The difference similarity vector comprises components corresponding to each relative positional difference. The components of the difference similarity vector are aggregated to determine a final score for the candidate pair. A document-level similarity metric is determined from the final score. The method includes determining whether the final score is above a cutoff value, and in response to determining that the final score for the candidate pair is above…
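
The scoring steps named in the abstract (positional differences, a similarity function applied per difference, aggregation, and a cutoff test) can be illustrated with a minimal sketch. Nothing below is taken from the patent claims. The `TextElement` type, the normalized page coordinates, the exponential-decay similarity function, the mean aggregation, and the `scale` and `cutoff` values are all assumptions chosen for illustration. Because the abstract is truncated, what happens once a pair exceeds the cutoff is not modeled.

```python
from dataclasses import dataclass
from math import exp, hypot


@dataclass(frozen=True)
class TextElement:
    """Hypothetical extraction attribute: a labeled text element and its position."""
    label: str  # e.g. "total" or "date" (illustrative labels)
    x: float    # horizontal position, assumed normalized to 0..1
    y: float    # vertical position, assumed normalized to 0..1


def positional_differences(doc_a, doc_b):
    """Distance between same-labeled elements of the two documents.

    Assumption: the "relative positional difference" is modeled as the
    Euclidean distance between matching elements.
    """
    by_label = {e.label: e for e in doc_b}
    diffs = {}
    for e in doc_a:
        other = by_label.get(e.label)
        if other is not None:
            diffs[e.label] = hypot(e.x - other.x, e.y - other.y)
    return diffs


def similarity(diff, scale=0.05):
    """Assumed similarity function: 1.0 at zero distance, decaying toward 0."""
    return exp(-diff / scale)


def difference_similarity_vector(diffs):
    """One component per positional difference, in a stable label order."""
    return [similarity(diffs[label]) for label in sorted(diffs)]


def final_score(vector):
    """Assumed aggregation: arithmetic mean of the components."""
    return sum(vector) / len(vector) if vector else 0.0


def is_above_cutoff(doc_a, doc_b, cutoff=0.8):
    """True when the candidate pair's final score exceeds the cutoff."""
    vector = difference_similarity_vector(positional_differences(doc_a, doc_b))
    return final_score(vector) > cutoff
```

Under these assumptions, two documents whose labeled elements sit at the same positions score 1.0, while a pair whose elements are shifted by half a page scores close to 0 and falls below the cutoff.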

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.