Multi-stage machine learning model training for key-value extraction
US12361736B2 · kind B2 · utility
Inventors
Key dates
| Filing date | Jan 4, 2023 |
| Grant date | Jul 15, 2025 |
| Priority date | — |
| Expiry date | Oct 26, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G06V30/1448
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
- WIPO sectorElectrical engineering
Abstract
Techniques for multi-stage training of a machine learning model to extract key-value pairs from documents are disclosed. A system trains a machine learning model using a set of training data including unlabeled documents of various document categories. The initial stage identifies relationships among tokens (words, numbers, and punctuation) in documents. The system re-trains the machine learning model using a set of training data which includes a particular category of documents while excluding other categories of documents. The second training stage is a supervised machine learning stage in which the training data is labeled to identify key-value pairs in the documents. In the initial training stage, the system sets parameters of the machine learning model to an initial state. In the second stage, the system modifies the parameters of the machine learning model based on the characteristics of the training data set including the documents of the particular category.
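The two-stage flow described in the abstract can be illustrated with a deliberately tiny sketch. It is not the patented method: the per-token perceptron, the "is the next token a colon?" pretext task, and every name here (`pretrain`, `fine_tune`, `extract_pairs`, the sample documents) are assumptions chosen only to show the shape of the idea. Stage 1 sets the parameters from unlabeled documents of mixed categories; stage 2 starts from those parameters and adjusts them using labeled documents of a single category.

```python
from collections import defaultdict

def _perceptron_step(weights, tok, target, lr):
    # Update the token's weight only when the current prediction is wrong.
    pred = 1.0 if weights[tok] > 0 else -1.0
    if pred != target:
        weights[tok] += lr * target

def pretrain(unlabeled_docs, lr=0.5, epochs=5):
    """Stage 1 (toy, self-supervised): learn one weight per token that
    predicts whether the token is followed by ':'. The target comes from
    the text itself, so no human labels are needed."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for tokens in unlabeled_docs:
            for tok, nxt in zip(tokens, tokens[1:]):
                _perceptron_step(weights, tok, 1.0 if nxt == ":" else -1.0, lr)
    return weights

def fine_tune(pretrained, labeled_docs, lr=0.5, epochs=5):
    """Stage 2 (toy, supervised): start from a copy of the stage-1 weights
    and adjust them with human labels (1 = token is a key) drawn from a
    single document category."""
    weights = defaultdict(float, pretrained)  # copy, stage-1 weights stay intact
    for _ in range(epochs):
        for tokens, is_key in labeled_docs:
            for tok, label in zip(tokens, is_key):
                _perceptron_step(weights, tok, 1.0 if label else -1.0, lr)
    return weights

def extract_pairs(weights, tokens, separator=":"):
    """Pair each predicted key with the tokens that follow it, up to the next key."""
    pairs, key, value = [], None, []
    for tok in tokens:
        if weights.get(tok, 0.0) > 0:  # predicted key
            if key is not None:
                pairs.append((key, " ".join(value)))
            key, value = tok, []
        elif key is not None and tok != separator:
            value.append(tok)
    if key is not None:
        pairs.append((key, " ".join(value)))
    return pairs

# Hypothetical stage-1 data: unlabeled documents of several categories.
unlabeled_docs = [
    "Invoice : 1001 Total : 250 Due : 2023-01-31".split(),
    "Name : Ada Email : ada@example.com".split(),
    "Dear Ada , thank you".split(),
]
# Hypothetical stage-2 data: labeled invoices only (1 marks a key token).
labeled_invoices = [
    ("Invoice : 1002 Vendor Acme Total : 300".split(), [1, 0, 0, 1, 0, 1, 0, 0]),
]

pretrained = pretrain(unlabeled_docs)
tuned = fine_tune(pretrained, labeled_invoices)

doc = "Invoice : 1003 Vendor Globex Total : 75".split()
before = extract_pairs(pretrained, doc)
after = extract_pairs(tuned, doc)
print(before)
print(after)
```

In this toy run, the stage-1 weights alone miss `Vendor` as a key because it is never followed by a colon in the unlabeled data, so `Vendor Globex` is folded into the `Invoice` value. After the category-specific stage, `Vendor` is recognized and the three pairs come out separately. A real system would presumably use a far richer model and pretext task; the sketch only shows how the second stage reuses and modifies the parameters set by the first.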
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.