Patent · US Active

End-to-end automated speech recognition on numeric sequences

US11367432B2 · kind B2 · utility

2Cited by
1References
26Claims
0Family size

Assignee

Inventors

Key dates

Filing dateMar 26, 2020
Grant dateJun 21, 2022
Priority date
Expiry dateSep 3, 2040

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG06N3/045
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A method for generating final transcriptions representing numerical sequences of utterances in a written domain includes receiving audio data for an utterance containing a numeric sequence, and decoding, using a sequence-to-sequence speech recognition model, the audio data for the utterance to generate, as output from the sequence-to-sequence speech recognition model, an intermediate transcription of the utterance. The method also includes processing, using a neural corrector/denormer, the intermediate transcription to generate a final transcription that represents the numeric sequence of the utterance in a written domain. The neural corrector/denormer is trained on a set of training samples, where each training sample includes a speech recognition hypothesis for a training utterance and a ground-truth transcription of the training utterance. The ground-truth transcription of the training utterance is in the written domain. The method also includes providing the final transcription representing the numeric sequence of the utterance in the written domain for output.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.