Patent · US Active

Electronic document content extraction and document type determination

US10909309B2 · kind B2 · utility

Cited by: 0
References: 33
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jan 12, 2018
Grant date: Feb 2, 2021
Priority date
Expiry date: Nov 14, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/418
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method include receiving content of an electronic document having a document type, the content divided into components each having a unique identifier, and selecting an extraction schema based on the document type, the extraction schema having a plurality of data categories. For each of the components, the extraction schema is applied to identify content of the component that corresponds to individual ones of the data categories, and a processor saves, in an electronic data storage, in a record associated with the component, category metadata indicative of content of the component corresponding to the data categories. Once the category metadata has been obtained for each of the components, the extraction schema is applied to the content metadata of each of the components and to the electronic document as a whole to determine document metadata. A user interface displays the document metadata.
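
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: the `Component` and `ExtractionSchema` classes, the `SCHEMAS` registry, the `process_document` function, the use of regular expressions as data categories, the "invoice" document type, and a plain dictionary standing in for the electronic data storage.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Component:
    component_id: str  # unique identifier of this part of the document
    text: str          # content of the component


@dataclass
class ExtractionSchema:
    # Assumption: each data category is modeled as a regular expression.
    categories: dict[str, str]

    def extract(self, text: str) -> dict[str, list[str]]:
        """Return the content of `text` matching each data category."""
        return {
            name: re.findall(pattern, text)
            for name, pattern in self.categories.items()
        }


# Hypothetical registry mapping a document type to its extraction schema.
SCHEMAS = {
    "invoice": ExtractionSchema({
        "date": r"\d{4}-\d{2}-\d{2}",
        "amount": r"\$\d+(?:\.\d{2})?",
    }),
}


def process_document(doc_type: str, components: list[Component],
                     store: dict[str, dict[str, list[str]]]) -> dict:
    """Extract per-component category metadata, then document metadata."""
    schema = SCHEMAS[doc_type]  # select the schema from the document type

    # Pass 1: save category metadata in a record keyed by component id.
    for component in components:
        store[component.component_id] = schema.extract(component.text)

    # Pass 2: runs only after every component has a record. It combines
    # the stored records with a document-wide property.
    document_metadata: dict = {name: [] for name in schema.categories}
    for component in components:
        for name, values in store[component.component_id].items():
            document_metadata[name].extend(values)
    document_metadata["component_count"] = len(components)
    return document_metadata


if __name__ == "__main__":
    parts = [
        Component("c1", "Invoice dated 2018-01-12"),
        Component("c2", "Total due: $120.50 and $3"),
    ]
    records: dict[str, dict[str, list[str]]] = {}
    # Printing stands in for the user interface that displays the result.
    print(process_document("invoice", parts, records))
```

The two passes are kept separate to mirror the ordering in the abstract: document metadata is derived only once category metadata exists for every component.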

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.