Structured document analyzer
US10839245B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 25, 2019 |
| Grant date | Nov 17, 2020 |
| Priority date | — |
| Expiry date | Apr 17, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/10
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A structured document analyzer that associates keys and values in structured documents based on key, value, and key-value container bounding boxes. A trained machine learning model analyzes images of structured documents to determine bounding boxes for keys, values, and key-value containers in the images with confidence scores for the classifications. For each image, duplicate bounding boxes are removed, and then a set of key-value containers are selected and sorted based on the confidence scores. For each key-value container, a best key and value are determined for the container based on overlap of the key and value bounding boxes with the container bounding box and the confidence scores. Optical character recognition may be performed on the image to determine text for the keys and values.
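The post-processing steps named in the abstract (duplicate removal, confidence-sorted container selection, and overlap-based key and value matching) can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not give thresholds or a scoring formula, so the following are all assumptions made for illustration: the `Box` type, the IoU-based duplicate test, the cutoff values, the "overlap fraction times confidence" ranking, and the rule that a matched key or value is not reused. The detection model and the OCR step are left out; the boxes are taken as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical detection record: class label, corner coordinates, confidence."""
    label: str   # "key", "value", or "container"
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def area(b: Box) -> float:
    return max(0.0, b.x2 - b.x1) * max(0.0, b.y2 - b.y1)


def intersection(a: Box, b: Box) -> float:
    w = min(a.x2, b.x2) - max(a.x1, b.x1)
    h = min(a.y2, b.y2) - max(a.y1, b.y1)
    return max(0.0, w) * max(0.0, h)


def iou(a: Box, b: Box) -> float:
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def dedupe(boxes, iou_threshold=0.8):
    """Keep the highest-confidence box among same-label near-duplicates.

    Assumption: "duplicate" means same label and IoU >= iou_threshold.
    """
    kept = []
    for box in sorted(boxes, key=lambda b: b.score, reverse=True):
        if all(k.label != box.label or iou(k, box) < iou_threshold for k in kept):
            kept.append(box)
    return kept


def best_match(container: Box, candidates, min_overlap=0.5):
    """Pick the candidate that best fits the container.

    Assumption: rank = (fraction of candidate inside container) * confidence.
    """
    best, best_rank = None, 0.0
    for cand in candidates:
        cand_area = area(cand)
        frac = intersection(cand, container) / cand_area if cand_area > 0 else 0.0
        if frac < min_overlap:
            continue
        rank = frac * cand.score
        if rank > best_rank:
            best, best_rank = cand, rank
    return best


def associate(boxes, min_container_score=0.5):
    """Return (key, value) box pairs, one per usable key-value container."""
    boxes = dedupe(boxes)
    containers = sorted(
        (b for b in boxes if b.label == "container" and b.score >= min_container_score),
        key=lambda b: b.score,
        reverse=True,
    )
    keys = [b for b in boxes if b.label == "key"]
    values = [b for b in boxes if b.label == "value"]
    pairs = []
    for container in containers:
        key = best_match(container, keys)
        value = best_match(container, values)
        if key is not None and value is not None:
            pairs.append((key, value))
            # Assumption: a matched key or value is not reused by later containers.
            keys.remove(key)
            values.remove(value)
    return pairs
```

A small usage example with made-up coordinates: one container, a key with a lower-confidence near-duplicate, one value inside the container, and one value outside it.

```python
boxes = [
    Box("container", 0, 0, 100, 20, 0.90),
    Box("key", 2, 2, 40, 18, 0.95),
    Box("key", 2, 2, 41, 18, 0.60),      # near-duplicate of the key above
    Box("value", 50, 2, 98, 18, 0.90),
    Box("value", 50, 30, 98, 48, 0.99),  # lies outside the container
]
pairs = associate(boxes)
```

In a full pipeline, the text inside each paired key and value box would then be read with an OCR engine, as the abstract's last sentence describes.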
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.