Patent · US Active

Systems and methods for machine learning based content extraction from document images

US10867171B1 · kind B1 · utility

Cited by: 14
References: 1
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 22, 2018
Grant date: Dec 15, 2020
Priority date
Expiry date: May 10, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/414
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A method and apparatus for recognizing and extracting data from a form depicted within an image of a document are described. The method may include receiving the image of the document, the image depicting the form and data contained on the form. The method may also include transforming the image of the document to a set of one or more key, value pairs by processing the image of the document with a sequence of two or more trained machine learning based image analysis processes, wherein keys are relevant to forms of the type depicted in the form, and wherein each value is associated with a key. The method may also include generating a data output that comprises the set of key, value pairs for textual data recognized and extracted from the form depicted in the image.
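
The flow described in the abstract (image in, a sequence of two analysis stages, key, value pairs out) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: `locate_fields` and `recognize_text` stand in for the trained machine learning processes and are replaced by trivial stubs, the image is a toy grid of integers, and JSON is only one possible form of the data output.

```python
import json
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]           # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), end-exclusive


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def extract_key_values(
    image: Image,
    locate_fields: Callable[[Image], List[Tuple[str, Box]]],
    recognize_text: Callable[[Image], str],
) -> Dict[str, str]:
    """Run two stages in sequence: find keyed regions, then read each one."""
    pairs: Dict[str, str] = {}
    for key, box in locate_fields(image):             # stage 1 (assumed model)
        pairs[key] = recognize_text(crop(image, box))  # stage 2 (assumed model)
    return pairs


# Stubs standing in for trained models; a real system would load them.
def stub_locator(image: Image) -> List[Tuple[str, Box]]:
    return [("name", (0, 0, 1, 2)), ("total", (1, 0, 2, 2))]


def stub_recognizer(patch: Image) -> str:
    # Placeholder "recognition": the sum of the pixel values as a string.
    return str(sum(sum(row) for row in patch))


image = [[1, 2], [3, 4]]
result = extract_key_values(image, stub_locator, stub_recognizer)
data_output = json.dumps(result, sort_keys=True)
print(data_output)
```

Passing the stages in as callables keeps the sequencing logic separate from whichever models are used, so further stages could be inserted without changing the shape of the output.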

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.