Layout-agnostic clustering-based classification of document keys and values
US10872236B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 28, 2018 |
| Grant date | Dec 22, 2020 |
| Priority date | — |
| Expiry date | Feb 14, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/10
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Techniques for layout-agnostic clustering-based classification of document keys and values are described. A key-value differentiation unit generates feature vectors corresponding to text elements of a form represented within an electronic image using a machine learning (ML) model. The ML model was trained utilizing a loss function that separates keys from values. The feature vectors are clustered into at least two clusters, and a cluster is determined to include either keys of the form or values of the form via identifying neighbors between feature vectors of the cluster(s) with labeled feature vectors.
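The cluster-then-label flow in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the toy 2-D vectors stand in for the ML model's feature vectors, k-means is used because the abstract does not name a clustering algorithm, and the per-cluster majority vote over nearest labeled neighbors is one plausible reading of "identifying neighbors". The names `cluster`, `label_clusters`, `vectors`, and `labeled` are hypothetical.

```python
import math
from collections import Counter


def cluster(vectors, k=2, iters=20):
    """Plain k-means with deterministic farthest-point initialisation.

    Assumption: the abstract only says the vectors are "clustered into at
    least two clusters"; k-means is a stand-in, not the patented method.
    """
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the vector farthest from every centroid chosen so far.
        centroids.append(
            max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        )
    assign = []
    for _ in range(iters):
        # Assign each vector to its nearest centroid.
        assign = [
            min(range(k), key=lambda j: math.dist(v, centroids[j]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(vectors, assign) if a == j]
            if members:
                centroids[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return assign


def label_clusters(vectors, assign, labeled):
    """Name each cluster 'key' or 'value' by majority vote.

    Each vector votes with the label of its nearest labeled vector
    (hypothetical voting rule chosen for this sketch).
    """
    votes = {}
    for v, a in zip(vectors, assign):
        _, nearest_label = min(labeled, key=lambda lv: math.dist(v, lv[0]))
        votes.setdefault(a, Counter())[nearest_label] += 1
    return {a: c.most_common(1)[0][0] for a, c in votes.items()}


# Toy stand-ins for model-generated feature vectors of a form's text elements.
vectors = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
# Toy stand-ins for previously labeled feature vectors.
labeled = [((0.05, 0.05), "key"), ((0.95, 0.95), "value")]

assignments = cluster(vectors)
cluster_labels = label_clusters(vectors, assignments, labeled)
print(assignments, cluster_labels)
```

In this sketch the clustering step never sees a label; labels enter only when a whole cluster is named, which is one way to read the abstract's description of classifying a cluster rather than individual text elements.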
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.