Patent · US Active

Embeddings with classes

US11373042B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 3, 2019
Grant date: Jun 28, 2022
Priority date
Expiry date: Dec 24, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06T1/20
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Described herein are systems and methods for word embeddings that avoid the need to discard rare words appearing fewer than a certain number of times in a corpus. Embodiments of the present disclosure group words into clusters/classes multiple times, each time using a different assignment of the vocabulary words to a number of classes. Multiple copies of the training corpus are then generated by using the assignments to replace each word with its class. A word embedding generating model is run on the multiple class corpora to generate multiple class embeddings. An estimate of the gold word embedding matrix is then reconstructed from multiple pairs of assignments, class embeddings, and covariances. Test results show the effectiveness of embodiments of the present disclosure.
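
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the function names (make_assignments, to_class_corpus, naive_reconstruct), the random round-robin assignment scheme, and the toy corpus are all assumptions made for illustration. The embedding model is replaced by placeholder class vectors, and the reconstruction step is a plain per-word average that ignores the covariances the abstract mentions and assumes all class embeddings share one coordinate space.

```python
import random


def make_assignments(vocab, num_classes, num_rounds, seed=0):
    """Return one word -> class-id mapping per round (hypothetical scheme)."""
    rng = random.Random(seed)
    assignments = []
    for _ in range(num_rounds):
        shuffled = list(vocab)
        rng.shuffle(shuffled)
        # Deal shuffled words round-robin so classes are roughly equal in size.
        assignments.append({word: i % num_classes for i, word in enumerate(shuffled)})
    return assignments


def to_class_corpus(corpus, assignment):
    """Copy the corpus, replacing each word with the token of its class."""
    return [[f"class_{assignment[word]}" for word in sentence] for sentence in corpus]


def naive_reconstruct(vocab, assignments, class_embeddings):
    """Estimate each word vector as the mean of the class vectors it was mapped to.

    Simplification: no covariance weighting, unlike the abstract's description.
    """
    estimate = {}
    for word in vocab:
        vectors = [emb[assign[word]] for assign, emb in zip(assignments, class_embeddings)]
        dim = len(vectors[0])
        estimate[word] = [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]
    return estimate


# Toy corpus; "aardvark" and "rare" each occur only once.
corpus = [["the", "cat", "sat"], ["the", "rare", "aardvark", "sat"]]
vocab = sorted({word for sentence in corpus for word in sentence})

assignments = make_assignments(vocab, num_classes=2, num_rounds=3)
class_corpora = [to_class_corpus(corpus, a) for a in assignments]

# Placeholder class embeddings standing in for the output of an embedding
# model trained on each class corpus: class c in round r -> [c, r].
class_embeddings = [{c: [float(c), float(r)] for c in range(2)} for r in range(3)]

word_vectors = naive_reconstruct(vocab, assignments, class_embeddings)
```

Because every word, however rare, belongs to some class in every round, each word receives an estimated vector without any frequency cutoff. That is the property the abstract emphasizes.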

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.