Machine learning data extraction algorithms
US10824811B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 10, 2018 |
| Grant date | Nov 3, 2020 |
| Priority date | — |
| Expiry date | Dec 3, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/10
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Embodiments of the present disclosure pertain to extracting data corresponding to particular data types using machine learning algorithms. In one embodiment, a method includes receiving an image in a backend system, sending the image to an optical character recognition (OCR) component, and in accordance therewith, receiving a plurality of characters recognized in the image. The character set is matched against known values to generate candidate character strings. The character set is processed by one or more machine learning algorithms to produce features. For each candidate character string, the features are then processed by a random forest model to determine a final character string.
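The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the OCR step is replaced by a hard-coded string, `KNOWN_VALUES`, the three features, and every function name are hypothetical, and `stand_in_score` is a simple average standing in for the trained random forest model that the abstract names.

```python
from difflib import SequenceMatcher

# Hypothetical list of known values (for example vendor names); not from the patent.
KNOWN_VALUES = ["Acme Hotel", "Globex Airlines", "Initech Cafe"]


def candidate_strings(ocr_text, known_values, threshold=0.8):
    """Return (value, ratio) pairs for known values that approximately occur in the text."""
    text = ocr_text.lower()
    candidates = []
    for value in known_values:
        needle = value.lower()
        width = len(needle)
        best = 0.0
        # Slide a window of the value's length over the text and keep the best ratio.
        for start in range(max(1, len(text) - width + 1)):
            window = text[start:start + width]
            best = max(best, SequenceMatcher(None, needle, window).ratio())
        if best >= threshold:
            candidates.append((value, best))
    return candidates


def features(ocr_text, candidate, match_ratio):
    """Build a small, purely illustrative feature vector for one candidate."""
    position = ocr_text.lower().find(candidate.lower()[:3])
    return [
        match_ratio,                                # fuzzy match quality
        len(candidate) / max(1, len(ocr_text)),     # relative length
        0.0 if position < 0 else 1.0 - position / len(ocr_text),  # earlier is higher
    ]


def pick_final(ocr_text, known_values, score):
    """Score each candidate's features and return the highest-scoring string, or None."""
    scored = [
        (score(features(ocr_text, value, ratio)), value)
        for value, ratio in candidate_strings(ocr_text, known_values)
    ]
    return max(scored)[1] if scored else None


def stand_in_score(feature_vector):
    # Assumption: stands in for a trained random forest's per-candidate score.
    return sum(feature_vector) / len(feature_vector)


# Simulated OCR output with a typical misread ("0" for "O").
ocr_text = "ACME H0TEL  Receipt 2018-10-10  Total 123.45"
final_string = pick_final(ocr_text, KNOWN_VALUES, stand_in_score)
print(final_string)
```

Passing the scorer in as a callable keeps candidate generation and feature building separate from the model, so a trained classifier's scoring function could replace `stand_in_score` without changing the rest of the sketch.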
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.