Disambiguation in speech recognition
US9484021B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 30, 2015 |
| Grant date | Nov 1, 2016 |
| Priority date | — |
| Expiry date | May 11, 2035 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L15/00
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Automatic speech recognition (ASR) processing including a two-stage configuration. After ASR processing of an incoming utterance where the ASR outputs an N-best list including multiple hypotheses, a first stage determines whether to execute a command associated with one of the hypotheses or whether to output some of the hypotheses of the N-best list for disambiguation. A second stage determines what hypotheses should be included in the disambiguation choices. A first machine learning model is used at the first stage and a second machine learning model is used at the second stage. The multi-stage configuration allows for reduced speech processing errors as well as a reduced number of utterances sent for disambiguation, which thus improves the user experience.
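The two-stage flow described in the abstract can be illustrated with a minimal sketch. Nothing here comes from the patent claims: the names `Hypothesis`, `Decision`, `process`, `margin_first_stage`, and `close_to_top`, the 0-to-1 score scale, and the threshold values are all hypothetical. The two simple threshold rules only stand in for the first and second machine learning models, which the abstract does not specify. The sketch also assumes that the command executed directly is the one for the top-ranked hypothesis.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Hypothesis:
    text: str
    score: float  # assumed ASR confidence in [0, 1], higher is better


@dataclass(frozen=True)
class Decision:
    execute: Optional[Hypothesis]  # set when a command is run directly
    choices: List[Hypothesis]      # filled when the user is asked to choose


# Hypothetical stand-ins for the two trained models.
FirstStage = Callable[[Sequence[Hypothesis]], bool]       # True -> disambiguate
SecondStage = Callable[[Hypothesis, Hypothesis], bool]    # True -> offer as a choice


def margin_first_stage(ranked: Sequence[Hypothesis], margin: float = 0.2) -> bool:
    """Toy first stage: disambiguate when the top two scores are close."""
    if len(ranked) < 2:
        return False
    return (ranked[0].score - ranked[1].score) < margin


def close_to_top(top: Hypothesis, other: Hypothesis, window: float = 0.3) -> bool:
    """Toy second stage: keep hypotheses scoring within `window` of the top one."""
    return (top.score - other.score) < window


def process(
    n_best: Sequence[Hypothesis],
    needs_disambiguation: FirstStage = margin_first_stage,
    keep_choice: SecondStage = close_to_top,
) -> Decision:
    """Route an N-best list through the two stages."""
    if not n_best:
        raise ValueError("n_best must contain at least one hypothesis")
    ranked = sorted(n_best, key=lambda h: h.score, reverse=True)
    top = ranked[0]

    # Stage 1: execute directly, or send the utterance for disambiguation.
    if not needs_disambiguation(ranked):
        return Decision(execute=top, choices=[])

    # Stage 2: select which hypotheses appear among the disambiguation choices.
    choices = [top] + [h for h in ranked[1:] if keep_choice(top, h)]
    return Decision(execute=None, choices=choices)


clear = process([Hypothesis("play adele", 0.9), Hypothesis("play a bell", 0.3)])
unclear = process([
    Hypothesis("play the wall", 0.5),
    Hypothesis("play the mall", 0.45),
    Hypothesis("play the ball", 0.1),
])
```

Splitting the decision this way keeps the two questions separate: whether to interrupt the user at all, and, if so, how short the list of choices can be. In a real system each toy rule would be replaced by a trained classifier.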
Source: USPTO / EPO open patent data.