Deep document processing with self-supervised learning
US11954139B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 19, 2020 |
| Grant date | Apr 9, 2024 |
| Priority date | — |
| Expiry date | May 19, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/43
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A document processing system processes documents including typewritten and/or handwritten data by converting them to document images for entity extraction. A received document is initially processed to generate a deep document data structure and to classify it as either a structured or an unstructured document. If the document is classified as a structured document, it is processed for entity extraction based on a matching template and image alignment of the document image with the matching template. If the document is classified as an unstructured document, entities are extracted by obtaining nodes and providing the nodes to a self-supervised masked visual language model.
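The two-branch flow described in the abstract can be sketched roughly as follows. This is an illustrative outline only, not the patented implementation: `DeepDocument`, `classify`, `extract_entities`, and both stub extractors are hypothetical names, and the rule that a document is "structured" whenever a matching template was found is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Entities = Dict[str, str]


@dataclass
class DeepDocument:
    """Hypothetical stand-in for the abstract's 'deep document data structure'."""
    nodes: List[str]                   # text nodes obtained from the document image
    template_id: Optional[str] = None  # set only if a matching template was found


def classify(doc: DeepDocument) -> str:
    # Assumption: a document is treated as structured when a template matched.
    return "structured" if doc.template_id is not None else "unstructured"


def extract_entities(
    doc: DeepDocument,
    from_template: Callable[[DeepDocument], Entities],
    from_masked_model: Callable[[DeepDocument], Entities],
) -> Entities:
    """Route the document to one of the two extraction paths."""
    if classify(doc) == "structured":
        # Structured path: template matching plus image alignment (stubbed here).
        return from_template(doc)
    # Unstructured path: nodes go to a masked visual language model (stubbed here).
    return from_masked_model(doc)


def stub_template_extractor(doc: DeepDocument) -> Entities:
    # Placeholder for template-based extraction.
    return {"route": "template", "template_id": str(doc.template_id)}


def stub_masked_model_extractor(doc: DeepDocument) -> Entities:
    # Placeholder for model-based extraction over the document's nodes.
    return {"route": "masked_model", "node_count": str(len(doc.nodes))}
```

A caller would pass real extractors in place of the stubs; keeping them as parameters keeps the routing decision separate from either extraction method.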
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.