Method and system for unsupervised discovery of unigrams in speech recognition systems
US11984116B2 · kind B2 · utility
Inventors
Key dates
| Filing date | Nov 8, 2021 |
| Grant date | May 14, 2024 |
| Priority date | — |
| Expiry date | Dec 1, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2015/0635
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method of automatically discovering unigrams in a speech data element may include receiving a language model that includes a plurality of n-grams, where each n-gram includes one or more unigrams; applying an acoustic machine-learning (ML) model on one or more speech data elements to obtain a character distribution function; applying a greedy decoder on the character distribution function, to predict an initial corpus of unigrams; filtering out one or more unigrams of the initial corpus to obtain a corpus of candidate unigrams, where the candidate unigrams are not included in the language model; analyzing the one or more speech data elements, to extract at least one n-gram that comprises a candidate unigram; and updating the language model to include the extracted at least one n-gram.
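The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the character distribution is a list of per-frame probability vectors, that the greedy decoder is a CTC-style argmax that collapses repeated characters and drops a blank symbol, and that the language model can be represented as a set of word tuples. All function names, the `BLANK` symbol, and the `min_count` and `n` parameters are hypothetical and do not come from the patent.

```python
from collections import Counter

BLANK = "_"  # assumed CTC-style blank symbol; not specified in the abstract


def greedy_decode(char_probs, alphabet):
    """Take the most likely character per frame, collapse repeats, drop blanks."""
    out = []
    prev = None
    for frame in char_probs:
        best = alphabet[max(range(len(frame)), key=frame.__getitem__)]
        if best != prev and best != BLANK:
            out.append(best)
        prev = best
    return "".join(out)


def candidate_unigrams(transcripts, lm_ngrams, min_count=1):
    """Keep decoded words that appear in no n-gram of the language model."""
    known = {word for ngram in lm_ngrams for word in ngram}
    counts = Counter(word for text in transcripts for word in text.split())
    return {w for w, c in counts.items() if w not in known and c >= min_count}


def extract_ngrams(transcripts, candidates, n=2):
    """Collect every n-gram in the transcripts that contains a candidate word."""
    found = set()
    for text in transcripts:
        words = text.split()
        for i in range(len(words) - n + 1):
            gram = tuple(words[i:i + n])
            if any(word in candidates for word in gram):
                found.add(gram)
    return found


def update_language_model(lm_ngrams, new_ngrams):
    """Return a new n-gram set that also includes the extracted n-grams."""
    return set(lm_ngrams) | set(new_ngrams)
```

In this sketch the filtering step is a plain vocabulary lookup with an optional frequency threshold. A production system would likely need stronger filtering, since a greedy decoder also emits misspellings and recognition errors that are absent from the language model.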
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.