Rare word recognition with LM-aware MWER training
US12354598B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 21, 2023 |
| Grant date | Jul 8, 2025 |
| Priority date | — |
| Expiry date | Jan 17, 2044 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L15/22
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method includes generating, using an audio encoder, a higher-order feature representation for each acoustic frame in a sequence of acoustic frames; generating, using a decoder, based on the higher-order feature representation, a plurality of speech recognition hypotheses, each hypothesis corresponding to a candidate transcription of an utterance and having an associated first likelihood score; generating, using an external language model, for each speech recognition hypothesis, a second likelihood score; determining, using a learnable fusion module, for each speech recognition hypothesis, a set of fusion weights based on the higher-order feature representation and the speech recognition hypothesis; and generating, using the learnable fusion module, for each speech recognition hypothesis, a third likelihood score based on the first likelihood score, the second likelihood score, and the set of fusion weights, the audio encoder and decoder trained using minimum additive error rate training in the presence of the external language model.
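The scoring steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how the fusion module computes its weights or combines the scores, so the sketch assumes a softmax-normalized linear projection of a pooled encoder summary concatenated with a hypothesis embedding, and a weighted sum of log-likelihoods. The loss assumes MWER in its usual sense of minimum word error rate over an N-best list, as the title suggests, although the abstract words it as "minimum additive error rate". All function names, parameters, and numbers below are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fusion_weights(encoder_summary, hyp_embedding, weight_matrix):
    """Assumed fusion module: project the concatenated features to two
    logits and normalize them into (decoder weight, LM weight)."""
    features = encoder_summary + hyp_embedding
    logits = [sum(w * f for w, f in zip(row, features)) for row in weight_matrix]
    return softmax(logits)


def fused_score(first_score, second_score, weights):
    """Third likelihood score: assumed weighted sum of the decoder
    log-likelihood and the external LM log-likelihood."""
    w_decoder, w_lm = weights
    return w_decoder * first_score + w_lm * second_score


def mwer_loss(fused_scores, word_errors):
    """Standard N-best MWER objective (an assumption here): expected
    word errors under renormalized hypothesis probabilities, with the
    mean error subtracted as a baseline."""
    probs = softmax(fused_scores)
    mean_errors = sum(word_errors) / len(word_errors)
    return sum(p * (e - mean_errors) for p, e in zip(probs, word_errors))


# Toy N-best list with made-up values (not data from the patent).
encoder_summary = [0.2, -0.1]
weight_matrix = [[0.5, -0.3, 0.1, 0.4], [-0.2, 0.6, 0.3, -0.1]]
hypotheses = [
    # (hypothesis embedding, decoder log-lik, LM log-lik, word errors)
    ([0.7, 0.1], -3.2, -4.0, 0),
    ([0.1, 0.9], -2.9, -7.5, 2),
]

scores = []
for embedding, first, second, _ in hypotheses:
    weights = fusion_weights(encoder_summary, embedding, weight_matrix)
    scores.append(fused_score(first, second, weights))

loss = mwer_loss(scores, [h[3] for h in hypotheses])
```

Because the loss is computed from the fused scores, gradients in a real system would reflect the language model's contribution, which is one plausible reading of training "in the presence of the external language model".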
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.