Patent · US Active

Method and system for generating parsed document from digital document

US11200412B2 · kind B2 · utility

Cited by: 2
References: 1
Claims: 9
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 27, 2017
Grant date: Dec 14, 2021
Priority date
Expiry date: Aug 27, 2038

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/10
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and system for generating a parsed document from a digital document. The method includes segmenting the digital document into at least one section; classifying the at least one section of the digital document into at least one of a class: text class, table class, figure class, noise class; identifying a reading order of the digital document; and processing each of the at least one section of the digital document. Furthermore, processing each of the at least one section of the digital document comprises extracting content from each of the at least one section based on the class; and structuring the extracted content based on the reading order to generate the parsed document.
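
The steps named in the abstract (segment, classify, identify reading order, extract per class, structure by reading order) can be illustrated with a toy sketch. This is not the patented implementation: every name below (Section, SectionClass, classify, extract, parse_document) is hypothetical, the "digital document" is assumed to be a list of already located (top, left, raw_text) blocks, the classifier is a set of string heuristics, reading order is assumed to be top-to-bottom then left-to-right, and dropping noise sections from the output is an assumption the abstract does not state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, List, Optional, Tuple


class SectionClass(Enum):
    TEXT = "text"
    TABLE = "table"
    FIGURE = "figure"
    NOISE = "noise"


@dataclass
class Section:
    top: float    # vertical position; smaller means higher on the page
    left: float   # horizontal position; smaller means further left
    raw: str      # raw content of the block
    label: Optional[SectionClass] = None


def segment(document: List[Tuple[float, float, str]]) -> List[Section]:
    # Assumption: blocks are already located; real segmentation would
    # work on page images or layout data.
    return [Section(top, left, raw) for top, left, raw in document]


def classify(section: Section) -> SectionClass:
    # Toy heuristics standing in for a real classifier.
    text = section.raw.strip()
    if not text or text.isdigit():      # blank block or bare page number
        return SectionClass.NOISE
    if text.startswith("[image]"):
        return SectionClass.FIGURE
    if "|" in text:
        return SectionClass.TABLE
    return SectionClass.TEXT


def extract(section: Section) -> Any:
    # Class-specific extraction: the extractor depends on the label.
    text = section.raw.strip()
    if section.label is SectionClass.TABLE:
        return [[cell.strip() for cell in row.split("|")]
                for row in text.splitlines()]
    if section.label is SectionClass.FIGURE:
        return text[len("[image]"):].strip()   # keep only the caption
    return text


def parse_document(document: List[Tuple[float, float, str]]) -> List[dict]:
    sections = segment(document)
    for s in sections:
        s.label = classify(s)
    # Assumed reading order: top-to-bottom, then left-to-right.
    ordered = sorted(sections, key=lambda s: (s.top, s.left))
    # Assumption: noise sections are omitted from the parsed output.
    return [{"class": s.label.value, "content": extract(s)}
            for s in ordered if s.label is not SectionClass.NOISE]
```

Calling parse_document on a list of blocks returns one dictionary per non-noise section, ordered by position, with a "class" key and a class-dependent "content" value (a string for text and figure captions, a list of rows for tables).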

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.