Systems and methods for machine learning based content extraction from document images
US10867171B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 22, 2018 |
| Grant date | Dec 15, 2020 |
| Priority date | — |
| Expiry date | May 10, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/414
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method and apparatus for recognizing and extracting data from a form depicted within an image of a document are described. The method may include receiving the image of the document, the image depicting the form and data contained on the form. The method may also include transforming the image of the document to a set of one or more key, value pairs by processing the image of the document with a sequence of two or more trained machine learning based image analysis processes, wherein keys are relevant to forms of the type depicted in the form, and wherein each value is associated with a key. The method may also include generating a data output that comprises the set of key, value pairs for textual data recognized and extracted from the form depicted in the image.
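The flow described in the abstract (image in, a sequence of analysis stages, key, value pairs out) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `Region` type, the `extract_key_values` function, the two-stage split into field location and text reading, and the JSON output format. The two "models" are plain stand-in functions so the example runs without any machine learning library.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types: the abstract does not name any of these.
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Region:
    key: str  # field name relevant to the form type, e.g. "total"
    box: Box  # where that field's value appears in the image


def extract_key_values(
    image: object,
    locate_fields: Callable[[object], List[Region]],
    read_text: Callable[[object, Box], str],
) -> Dict[str, str]:
    """Run two analysis stages in sequence and collect key, value pairs."""
    pairs: Dict[str, str] = {}
    for region in locate_fields(image):  # stage 1: find keyed regions
        # stage 2: recognize the text inside each located region
        pairs[region.key] = read_text(image, region.box)
    return pairs


# Stand-ins for trained models (assumptions, not real recognizers).
def fake_locator(image: object) -> List[Region]:
    return [Region("date", (0, 0, 10, 5)), Region("total", (0, 6, 10, 11))]


def fake_reader(image: dict, box: Box) -> str:
    # The demo "image" is just a dict mapping a box to its text.
    return image[box]


demo_image = {(0, 0, 10, 5): "2018-10-22", (0, 6, 10, 11): "41.50"}
result = extract_key_values(demo_image, fake_locator, fake_reader)

# Data output comprising the set of key, value pairs.
output = json.dumps(result, sort_keys=True)
print(output)  # {"date": "2018-10-22", "total": "41.50"}
```

Passing the stages in as callables keeps the sequencing logic separate from whichever trained models would actually perform each step.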
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.