Patent · US Active

Systems and methods for machine learning based content extraction from document images

US10867171B1 · kind B1 · utility

Cited by	14

References	1

Claims	20

Family size	0

Assignee

OMNISCIENCE CORP. · US

Inventors

Alexander Wesley Contryman · San Francisco, US
Jacob Ryan van Gogh · Mountain View, US
Manu Shukla · Sterling, US

Key dates

Filing date	Oct 22, 2018
Grant date	Dec 15, 2020
Priority date	—
Expiry date	May 10, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G06V30/414
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method and apparatus for recognizing and extracting data from a form depicted within an image of a document are described. The method may include receiving the image of the document, the image depicting the form and data contained on the form. The method may also include transforming the image of the document to a set of one or more key, value pairs by processing the image of the document with a sequence of two or more trained machine learning based image analysis processes, wherein keys are relevant to forms of the type depicted in the form, and wherein each value is associated with a key. The method may also include generating a data output that comprises the set of key, value pairs for textual data recognized and extracted from the form depicted in the image.
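
Illustrative sketch (not part of the patent record): the abstract describes a sequence of two or more analysis processes that turns a document image into key, value pairs. The Python below shows only that data flow. Every name in it (`Region`, `detect_fields`, `recognize_text`, `extract_key_values`) is hypothetical, the "image" is stood in for by a list of text lines, and both stages are simple placeholders where the patent calls for trained machine learning models.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Assumption: an "image" is modelled as a list of text lines so the
# example runs without any imaging or ML library.
Image = List[str]


@dataclass
class Region:
    """A located form field: the key it belongs to and its raw content."""
    key: str
    text: str


def detect_fields(image: Image, keys: Sequence[str]) -> List[Region]:
    """Stage 1 placeholder: find regions for keys relevant to this form type."""
    regions = []
    for line in image:
        label, sep, rest = line.partition(":")
        # Keep only lines that have a separator and a known key.
        if sep and label.strip() in keys:
            regions.append(Region(key=label.strip(), text=rest))
    return regions


def recognize_text(regions: List[Region]) -> Dict[str, str]:
    """Stage 2 placeholder: convert each located region into a cleaned value."""
    return {region.key: region.text.strip() for region in regions}


def extract_key_values(image: Image, keys: Sequence[str]) -> Dict[str, str]:
    """Run the stages in sequence and return the key, value output."""
    return recognize_text(detect_fields(image, keys))


form = ["Name: Ada Lovelace", "Date: 1843-01-01", "Footer text"]
result = extract_key_values(form, ["Name", "Date"])
print(result)
```

The stages are kept as separate functions so that each could be replaced independently by a trained model, which mirrors the "sequence of two or more" processes in the abstract without claiming to reproduce the patented method.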

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.