Patent · US Active

Data extraction confidence attribute with transformations

US8676731B1 · kind B1 · utility

Cited by: 26
References: 0
Claims: 28
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jul 11, 2011
Grant date: Mar 18, 2014
Priority date
Expiry date: Mar 26, 2032

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/40
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A data extraction system for receiving and scanning documents to generate ordered input for storage in a database employs a non-linear statistical model for a data extraction sequence having a plurality of transformations. Each transformation transitions an extracted data value in various forms from a raw data image to a computed data value. For each transformation, a confidence model learns a confidence component for the particular transformation. The learned confidence components, generated from a control set of documents having known values, are employed in a production mode with actual raw data. The confidence component corresponds to a likelihood of transformation accuracy, and the confidence model aggregates the confidence components to compute a confidence for the extracted data value. A database stores the extracted data value labeled with the computed confidence attribute for subsequent use by an application employing the extracted data.
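
The flow described in the abstract (learn one confidence component per transformation from a control set, then aggregate the components for a production value) can be illustrated with a minimal Python sketch. This is not the patented method. The stage names (`ocr`, `parse`, `compute`), the `ControlSample` type, the use of observed accuracy as the learned component, and the product as the aggregation rule are all assumptions made for illustration. The abstract states only that a non-linear statistical model is used and does not specify its form.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class ControlSample:
    """One control-set observation for a single transformation (hypothetical type)."""
    transformation: str  # hypothetical stage name, e.g. "ocr"
    produced: str        # value the transformation emitted
    expected: str        # known-correct value from the control set


def learn_confidence_components(samples):
    """Estimate each transformation's accuracy as hits / total on the control set.

    Assumption: observed accuracy stands in for the learned confidence component.
    """
    hits, totals = {}, {}
    for s in samples:
        totals[s.transformation] = totals.get(s.transformation, 0) + 1
        if s.produced == s.expected:
            hits[s.transformation] = hits.get(s.transformation, 0) + 1
    return {name: hits.get(name, 0) / totals[name] for name in totals}


def aggregate_confidence(components, sequence):
    """Combine per-transformation components into one confidence value.

    Assumption: the product treats transformation errors as independent.
    """
    return prod(components[name] for name in sequence)


# Learning mode: a toy control set with known values (made-up data).
control_set = [
    ControlSample("ocr", "42.00", "42.00"),
    ControlSample("ocr", "17.5O", "17.50"),  # letter O misread for zero
    ControlSample("ocr", "8.25", "8.25"),
    ControlSample("ocr", "100", "100"),
    ControlSample("parse", "42.0", "42.0"),
    ControlSample("parse", "8.25", "8.25"),
    ControlSample("compute", "50.25", "50.25"),
    ControlSample("compute", "60.00", "59.50"),
]
components = learn_confidence_components(control_set)

# Production mode: label an extracted value with the aggregated confidence.
record = {
    "value": "42.00",
    "confidence": aggregate_confidence(components, ["ocr", "parse", "compute"]),
}
print(components)
print(record)
```

Under these assumptions, the stored record carries both the extracted value and its confidence attribute, so a downstream application could, for example, route low-confidence values to manual review.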

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.