Patent · US Active

Generating training sets to train machine learning models

US11514691B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Jun 12, 2019
Grant date: Nov 29, 2022
Priority date
Expiry date: Sep 25, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06V30/40
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A computer system trains a machine learning model. A vector representation is generated for each document in a collection of documents. The documents are clustered based on the vector representations of the documents to produce a plurality of clusters. A training set is produced by selecting one or more documents from each cluster, wherein the selected documents represent a sample of the collection of documents to train the machine learning model. The machine learning model is trained by applying the training set to the machine learning model. Embodiments of the present invention further include a method and program product for training a machine learning model in substantially the same manner described above.
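The pipeline in the abstract (vectorize each document, cluster the vectors, pick one or more documents per cluster as the training set) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the bag-of-words vectors, the plain k-means clustering, the "closest to centroid" selection rule, and the helper names `vectorize`, `kmeans`, and `select_training_set`. The final training step is left out because the abstract does not name a model.

```python
import random
from collections import Counter


def vectorize(docs):
    """Bag-of-words count vectors over a shared vocabulary (assumed representation)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(vectors, centroids):
    """Label each vector with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors]


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means (assumed clustering method); returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        labels = assign(vectors, centroids)
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(vectors, centroids), centroids


def select_training_set(vectors, labels, centroids, per_cluster=1):
    """Pick the documents closest to each centroid (assumed selection rule)."""
    selected = []
    for c in range(len(centroids)):
        idx = [i for i, l in enumerate(labels) if l == c]
        idx.sort(key=lambda i: sq_dist(vectors[i], centroids[c]))
        selected.extend(idx[:per_cluster])
    return sorted(selected)


# Hypothetical toy collection, for illustration only.
docs = [
    "invoice payment due",
    "invoice payment overdue",
    "invoice total payment",
    "meeting agenda notes",
    "meeting notes summary",
    "agenda meeting schedule",
]
vectors = vectorize(docs)
labels, centroids = kmeans(vectors, k=2)
training_indices = select_training_set(vectors, labels, centroids)
print([docs[i] for i in training_indices])
```

The selected documents would then be labeled and passed to whichever model is being trained. Sampling per cluster, rather than uniformly over the whole collection, is what lets a small training set still cover each group of similar documents.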

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.