Patent · US Active

Document matching and data extraction

US11860950B2 · kind B2 · utility

Cited by: 3
References: 89
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 30, 2021
Grant date: Jan 2, 2024
Priority date
Expiry date: Dec 25, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/045
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The system is configured to create a generalized document automation framework that captures relevant data from documents by replicating historical human actions associated with a document. The system may use machine vision and natural language processing to match a new document to a document in an existing corpus from which a human has already extracted data. The match is made by comparing both visual and textual elements, and it can be verified statistically by comparing the match metrics across multiple documents. After the match has been found and verified, the system takes the extractions from the historical document and maps them to similar regions in the new document, again based on both visual and textual commonalities between the documents. Data is then extracted from these regions of interest in the new document, sanity-checked for data integrity against historical data, and passed downstream for processing.
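
The matching and verification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it is hypothetical: the Doc class, the token-set and layout-vector representations, the equal weighting of text and visual similarity, and the z-score threshold are all assumptions chosen for illustration. Jaccard overlap stands in for the natural language comparison and cosine similarity for the machine vision comparison.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import sqrt
from statistics import mean, pstdev


@dataclass
class Doc:
    """Hypothetical document record; the abstract does not specify a format."""
    name: str
    tokens: set[str]        # textual elements (assumed: set of words)
    visual: list[float]     # visual elements (assumed: fixed-length layout vector)
    # Historical human extractions: field name -> (x, y, width, height) region.
    extractions: dict[str, tuple[int, int, int, int]] = field(default_factory=dict)


def text_similarity(a: Doc, b: Doc) -> float:
    """Jaccard overlap of the two token sets."""
    union = a.tokens | b.tokens
    return len(a.tokens & b.tokens) / len(union) if union else 0.0


def visual_similarity(a: Doc, b: Doc) -> float:
    """Cosine similarity of the two layout vectors."""
    dot = sum(x * y for x, y in zip(a.visual, b.visual))
    norm = sqrt(sum(x * x for x in a.visual)) * sqrt(sum(y * y for y in b.visual))
    return dot / norm if norm else 0.0


def match_score(a: Doc, b: Doc, text_weight: float = 0.5) -> float:
    """Blend textual and visual similarity (the weighting is an assumption)."""
    return (text_weight * text_similarity(a, b)
            + (1.0 - text_weight) * visual_similarity(a, b))


def best_match(new_doc: Doc, corpus: list[Doc], min_z: float = 1.0):
    """Return (doc, score) for the best match, or None if it does not stand out.

    Verification compares the best score with the scores of all corpus
    documents: the match is accepted only if its z-score reaches min_z.
    """
    if not corpus:
        return None
    scores = [match_score(new_doc, d) for d in corpus]
    best = max(range(len(corpus)), key=scores.__getitem__)
    spread = pstdev(scores)
    if spread == 0:
        return None  # every candidate scored the same, so nothing stands out
    z = (scores[best] - mean(scores)) / spread
    return (corpus[best], scores[best]) if z >= min_z else None


def map_regions(historical: Doc) -> dict[str, tuple[int, int, int, int]]:
    """Carry historical regions over unchanged (assumes identical layouts)."""
    return dict(historical.extractions)
```

In this sketch, map_regions simply copies the historical regions, which only holds when the two documents share a layout. The abstract describes mapping regions using visual and textual commonalities, so a fuller version would realign each region before extraction and then sanity-check the extracted values against historical data.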

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.