Patent · US Active

Multi-stage machine learning model training for key-value extraction

US12361736B2 · kind B2 · utility

Cited by: 0
References: 16
Claims: 20
Family size: 0

Inventors

Key dates

Filing date: Jan 4, 2023
Grant date: Jul 15, 2025
Priority date
Expiry date: Oct 26, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/1448
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Techniques for multi-stage training of a machine learning model to extract key-value pairs from documents are disclosed. In an initial training stage, a system trains the machine learning model using a set of training data that includes unlabeled documents of various document categories. This initial stage identifies relationships among tokens (words, numbers, and punctuation) in the documents and sets the parameters of the machine learning model to an initial state. In a second training stage, the system re-trains the machine learning model using a set of training data that includes a particular category of documents while excluding other categories of documents. The second training stage is a supervised machine learning stage in which the training data is labeled to identify key-value pairs in the documents. In this stage, the system modifies the parameters of the machine learning model based on the characteristics of the training data set that includes the documents of the particular category.
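
The two stages described in the abstract can be illustrated with a small, self-contained Python sketch. It is a toy stand-in and not the patented method: the record does not disclose the model architecture, so next-token counts are used here in place of the unlabeled first stage and a simple perceptron tagger in place of the supervised second stage. The class name TwoStageExtractor, the KEY/VALUE/O tag set, the feature names, and all sample documents are hypothetical.

```python
import re
from collections import Counter, defaultdict

LABELS = ("KEY", "VALUE", "O")  # hypothetical tag set, not from the patent


def tokenize(text):
    """Split text into word/number tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


class TwoStageExtractor:
    def __init__(self):
        self.next_counts = defaultdict(Counter)  # parameters set in stage 1
        self.weights = {}                        # parameters modified in stage 2

    def pretrain(self, unlabeled_docs):
        """Stage 1: learn token-to-next-token statistics from unlabeled text."""
        for doc in unlabeled_docs:
            toks = [t.lower() for t in tokenize(doc)]
            for cur, nxt in zip(toks, toks[1:]):
                self.next_counts[cur][nxt] += 1

    def _features(self, toks, i):
        tok = toks[i].lower()
        seen = self.next_counts.get(tok)  # .get avoids creating new entries
        likely_next = seen.most_common(1)[0][0] if seen else "<none>"
        prev = toks[i - 1].lower() if i > 0 else "<s>"
        return [f"tok={tok}", f"digit={tok.isdigit()}",
                f"prev={prev}", f"likely_next={likely_next}"]

    def predict(self, toks, i):
        feats = self._features(toks, i)
        return max(LABELS, key=lambda lab: sum(
            self.weights.get((f, lab), 0.0) for f in feats))

    def fine_tune(self, labeled_docs, category, max_epochs=100):
        """Stage 2: perceptron updates on labeled documents of one category."""
        docs = [(t, l) for cat, t, l in labeled_docs if cat == category]
        for _ in range(max_epochs):
            mistakes = 0
            for toks, labels in docs:
                for i, gold in enumerate(labels):
                    pred = self.predict(toks, i)
                    if pred == gold:
                        continue
                    mistakes += 1
                    for f in self._features(toks, i):
                        self.weights[(f, gold)] = self.weights.get((f, gold), 0.0) + 1.0
                        self.weights[(f, pred)] = self.weights.get((f, pred), 0.0) - 1.0
            if mistakes == 0:  # stop once the training set is tagged correctly
                break

    def extract(self, text):
        """Group consecutive KEY tokens with the VALUE tokens that follow."""
        toks = tokenize(text)
        pairs, key, value = [], [], []
        for i, tok in enumerate(toks):
            label = self.predict(toks, i)
            if label == "KEY":
                if value:  # a new key closes the pair collected so far
                    if key:
                        pairs.append((" ".join(key), " ".join(value)))
                    key, value = [], []
                key.append(tok)
            elif label == "VALUE":
                value.append(tok)
        if key and value:
            pairs.append((" ".join(key), " ".join(value)))
        return pairs


# Hypothetical data: unlabeled documents of mixed categories for stage 1,
# labeled documents tagged with a category for stage 2.
unlabeled = [
    "Invoice No: 7 Total: 12",
    "Name: Ada, Age: 36",
    "Dear Sir, please find the report attached.",
]
labeled = [
    ("invoice", ["Invoice", "No", ":", "1042"], ["KEY", "KEY", "O", "VALUE"]),
    ("invoice", ["Total", ":", "99"], ["KEY", "O", "VALUE"]),
    ("receipt", ["Paid", ":", "yes"], ["KEY", "O", "VALUE"]),  # excluded below
]

model = TwoStageExtractor()
model.pretrain(unlabeled)                      # stage 1: all categories, no labels
model.fine_tune(labeled, category="invoice")   # stage 2: one category, labeled
print(model.extract("Invoice No: 1042"))
```

In this sketch the second stage reuses the statistics gathered in the first stage as a feature (likely_next) and filters the labeled set down to a single category before updating the weights, mirroring the category restriction the abstract describes. A real system would presumably use a far larger model and corpus.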

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.