Patent · US Active

Extracting searchable information from a digitized document

US10318593B2 · kind B2 · utility

Cited by: 2
References: 1
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 21, 2017
Grant date: Jun 11, 2019
Priority date
Expiry date: Aug 21, 2037

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H04N2201/218
  • WIPO field: Audio-visual technology
  • WIPO sector: Electrical engineering

Abstract

Data extraction and automatic validation from digitized documents in non-editable formats is disclosed. Paper documents are digitized or converted into formats suitable for storage on computers or other digital devices. The digitized documents are classified into one of a plurality of document types and based on the document type, document processing rules are selected for analyzing the digitized documents to enable data extraction and automatic validation. The positions and values of the data fields in the digitized documents are obtained using machine learning techniques. The data field values are automatically validated and assigned confidence scores. Data fields with low confidence scores are flagged for manual review.
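
The flow described in the abstract (classify the document, select processing rules by type, extract field values, validate them, score confidence, flag low scores for manual review) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the abstract says field positions and values are obtained with machine learning techniques, whereas this example substitutes a keyword check and regular expressions as stand-ins. The rule table, the field names, the fixed confidence values, and the 0.8 review threshold are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical processing rules per document type:
# field name -> (extraction pattern, validation pattern).
RULES = {
    "invoice": {
        "invoice_number": (r"Invoice\s*(?:No\.?|#)\s*:?\s*(\S+)", r"[A-Z]{2,4}-\d{3,}"),
        "total": (r"Total\s*:?\s*\$?([\d.,]+)", r"\d+(?:,\d{3})*\.\d{2}"),
    },
    "receipt": {
        "amount_paid": (r"Amount paid\s*:?\s*\$?([\d.,]+)", r"\d+\.\d{2}"),
    },
}

REVIEW_THRESHOLD = 0.8  # assumed cutoff; the abstract does not give one


@dataclass
class Field:
    name: str
    value: Optional[str]
    confidence: float
    needs_review: bool


def classify(text: str) -> str:
    # Stand-in for the classifier: a simple keyword check.
    return "invoice" if "invoice" in text.lower() else "receipt"


def process(text: str):
    """Classify the text, apply the rules for its type, and score each field."""
    doc_type = classify(text)
    fields = []
    for name, (extract_re, valid_re) in RULES[doc_type].items():
        match = re.search(extract_re, text, flags=re.IGNORECASE)
        value = match.group(1) if match else None
        if value is None:
            confidence = 0.0   # field not found
        elif re.fullmatch(valid_re, value):
            confidence = 0.95  # found and passes validation (assumed score)
        else:
            confidence = 0.4   # found but fails validation (assumed score)
        fields.append(Field(name, value, confidence, confidence < REVIEW_THRESHOLD))
    return doc_type, fields


if __name__ == "__main__":
    # The letter "O" in the total simulates an OCR misread of a zero.
    sample = "INVOICE\nInvoice No: AB-1234\nTotal: $1,2O0.00"
    doc_type, fields = process(sample)
    print(doc_type)
    for field in fields:
        print(field)
```

In this sketch, a field whose extracted value fails its validation pattern, or that is not found at all, falls below the threshold and is marked for manual review, which mirrors the last step of the abstract.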

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.