Patent · US Active

ML using n-gram induced input representation

US11836438B2 · kind B2 · utility

Cited by: 2
References: 3
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 13, 2021
Grant date: Dec 5, 2023
Priority date
Expiry date: Apr 13, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/094
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Generally discussed herein are devices, systems, and methods for generating an embedding that is both local string-dependent and global string-dependent. The generated embedding can improve machine learning (ML) model performance. A method can include converting a string of words to a series of tokens, generating a local string-dependent embedding of each token of the series of tokens, generating a global string-dependent embedding of each token of the series of tokens, combining the local string-dependent embedding and the global string-dependent embedding to generate an n-gram induced embedding of each token of the series of tokens, obtaining a masked language model (MLM) previously trained to generate a masked word prediction, and executing the MLM based on the n-gram induced embedding of each token to generate the masked word prediction.
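
The embedding steps named in the abstract (tokenize, build a local embedding, build a global embedding, combine the two) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the whitespace tokenizer, the hash-derived token vectors, the window average used as the "local" embedding, the weight-free dot-product attention used as the "global" embedding, the element-wise sum used to combine them, and all function names. The sketch stops before the MLM step because no trained model is available here.

```python
import hashlib
import math

DIM = 8  # embedding width, chosen arbitrarily for this illustration


def tokenize(text):
    """Whitespace tokenizer standing in for a real subword tokenizer (assumption)."""
    return text.lower().split()


def token_vector(token, dim=DIM):
    """Deterministic pseudo-embedding derived from a hash, not a learned table."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:dim]]


def local_embeddings(vectors, n=3):
    """Average each token vector with its neighbours inside an n-token window."""
    half = n // 2
    out = []
    for i in range(len(vectors)):
        window = vectors[max(0, i - half): i + half + 1]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out


def global_embeddings(vectors):
    """Dot-product attention over the whole string, with no learned weights."""
    scale = math.sqrt(len(vectors[0]))
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in vectors]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[d] for w, v in zip(weights, vectors)) / total
            for d in range(len(q))
        ])
    return out


def ngram_induced_embeddings(text, n=3):
    """Combine local and global embeddings per token by element-wise sum (assumption)."""
    tokens = tokenize(text)
    base = [token_vector(t) for t in tokens]
    local = local_embeddings(base, n)
    glob = global_embeddings(base)
    combined = [[l + g for l, g in zip(lv, gv)] for lv, gv in zip(local, glob)]
    return tokens, combined


if __name__ == "__main__":
    tokens, embeddings = ngram_induced_embeddings("the cat sat on the [MASK]")
    for token, vector in zip(tokens, embeddings):
        print(token, [round(x, 3) for x in vector])
```

In a real system the combined per-token vectors would be the input to the previously trained MLM, which would then predict the word at the masked position.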

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.