Advancing word-based speech recognition processing
US10629193B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 9, 2018 |
| Grant date | Apr 21, 2020 |
| Priority date | — |
| Expiry date | Apr 28, 2038 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2015/223
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Non-limiting examples of the present disclosure describe advancements in acoustic-to-word modeling that improve accuracy in speech recognition processing through the replacement of out-of-vocabulary (OOV) tokens. During the decoding of speech signals, better accuracy in speech recognition processing is achieved through training and implementation of multiple different solutions that present enhanced speech recognition models. In one example, a hybrid neural network model for speech recognition processing combines a word-based neural network model as a primary model and a character-based neural network model as an auxiliary model. The primary word-based model emits a word sequence, and the output of the character-based auxiliary model is consulted at any segment where the word-based model emits an OOV token. In another example, a mixed-unit speech recognition model is developed and trained to generate a mixed word and character sequence during decoding of a speech signal without requiring generation of OOV tokens.
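The hybrid example in the abstract can be illustrated with a minimal sketch. It assumes both models have already been decoded into time-stamped outputs; the names `WordSegment`, `CharEmission`, `replace_oov`, the `<OOV>` marker string, and the frame-based alignment are all hypothetical choices made for illustration and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List

OOV = "<OOV>"  # hypothetical marker for an out-of-vocabulary token


@dataclass
class WordSegment:
    token: str   # word emitted by the primary word-based model
    start: int   # first frame of the segment (inclusive)
    end: int     # last frame of the segment (exclusive)


@dataclass
class CharEmission:
    char: str    # character emitted by the auxiliary character-based model
    frame: int   # frame at which the character was emitted


def replace_oov(words: List[WordSegment], chars: List[CharEmission]) -> List[str]:
    """Keep in-vocabulary words; spell out OOV segments from the character output."""
    result = []
    for seg in words:
        if seg.token != OOV:
            result.append(seg.token)
            continue
        # Consult the auxiliary output only within the frames of the OOV segment.
        spelled = "".join(
            c.char for c in chars if seg.start <= c.frame < seg.end
        ).strip()
        # Fall back to the OOV marker if no characters align with the segment.
        result.append(spelled if spelled else OOV)
    return result


# Usage: the word model does not know "adele" and emits an OOV token for it.
words = [WordSegment("play", 0, 10), WordSegment(OOV, 10, 30), WordSegment("songs", 30, 40)]
frames = [1, 3, 5, 7, 9, 12, 15, 18, 22, 26, 29, 31, 33, 35, 37, 39]
chars = [CharEmission(c, f) for c, f in zip("play adele songs", frames)]
print(replace_oov(words, chars))  # ['play', 'adele', 'songs']
```

The sketch covers only the substitution step; how the two models are trained and how their outputs are aligned in time is left to the full patent text.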
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.