Patent · US Active

Method and system for unsupervised discovery of unigrams in speech recognition systems

US11984116B2 · kind B2 · utility

Cited by: 0
References: 1
Claims: 17
Family size: 0

Inventors

Key dates

Filing date: Nov 8, 2021
Grant date: May 14, 2024
Priority date
Expiry date: Dec 1, 2042

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2015/0635
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method of automatically discovering unigrams in a speech data element may include receiving a language model that includes a plurality of n-grams, where each n-gram includes one or more unigrams; applying an acoustic machine-learning (ML) model on one or more speech data elements to obtain a character distribution function; applying a greedy decoder on the character distribution function, to predict an initial corpus of unigrams; filtering out one or more unigrams of the initial corpus to obtain a corpus of candidate unigrams, where the candidate unigrams are not included in the language model; analyzing the one or more first speech data elements, to extract at least one n-gram that comprises a candidate unigram; and updating the language model to include the extracted at least one n-gram.
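
The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that the abstract does not state: the character distribution is a list of per-frame probability vectors, the greedy decoder is CTC-style (collapse repeats, drop a blank symbol), the language model is a simple counter of word tuples, and n-grams are extracted from the greedy transcripts rather than by re-analyzing the audio. All function names are hypothetical.

```python
from collections import Counter

BLANK = "_"  # assumed blank symbol; the abstract does not specify one


def greedy_decode(char_probs, alphabet):
    """Take the most likely character per frame, collapse repeats, drop blanks."""
    best = [alphabet[max(range(len(frame)), key=frame.__getitem__)]
            for frame in char_probs]
    out, prev = [], None
    for ch in best:
        if ch != prev and ch != BLANK:
            out.append(ch)
        prev = ch
    return "".join(out)


def known_unigrams(language_model):
    """Collect every unigram that appears in any n-gram of the language model."""
    return {word for ngram in language_model for word in ngram}


def discover(language_model, transcripts, n=2):
    """Return candidate unigrams missing from the model and the n-grams containing them."""
    vocab = known_unigrams(language_model)
    # Filtering step: keep only decoded unigrams the model has never seen.
    candidates = {w for t in transcripts for w in t.split() if w not in vocab}
    new_ngrams = Counter()
    for t in transcripts:
        words = t.split()
        for i in range(len(words) - n + 1):
            ngram = tuple(words[i:i + n])
            if any(w in candidates for w in ngram):
                new_ngrams[ngram] += 1
    return candidates, new_ngrams


def update_language_model(language_model, new_ngrams):
    """Add the extracted n-gram counts to the model (a Counter keyed by word tuples)."""
    language_model.update(new_ngrams)
    return language_model
```

In this sketch, a caller would decode each speech data element with `greedy_decode`, pass the resulting transcripts to `discover`, and then merge the returned n-grams with `update_language_model`. A production system would likely add a confidence or frequency filter before accepting candidates, since greedy decoding without a language model tends to produce misspellings.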

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.