End-to-end automated speech recognition on numeric sequences
US11367432B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Mar 26, 2020 |
| Grant date | Jun 21, 2022 |
| Priority date | — |
| Expiry date | Sep 3, 2040 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06N3/045
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A method for generating final transcriptions representing numerical sequences of utterances in a written domain includes receiving audio data for an utterance containing a numeric sequence, and decoding, using a sequence-to-sequence speech recognition model, the audio data for the utterance to generate, as output from the sequence-to-sequence speech recognition model, an intermediate transcription of the utterance. The method also includes processing, using a neural corrector/denormer, the intermediate transcription to generate a final transcription that represents the numeric sequence of the utterance in a written domain. The neural corrector/denormer is trained on a set of training samples, where each training sample includes a speech recognition hypothesis for a training utterance and a ground-truth transcription of the training utterance. The ground-truth transcription of the training utterance is in the written domain. The method also includes providing the final transcription representing the numeric sequence of the utterance in the written domain for output.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.