Patent · US Active

End-to-end automated speech recognition on numeric sequences

US11367432B2 · kind B2 · utility

Cited by	2

References	1

Claims	26

Family size	0

Assignee

Google LLC · US

Inventors

Charles Caleb Peyser · New York, US
Hao Zhang · Shanghai, CN
Tara N. Sainath · Jersey City, US
Zelin Wu · Shanghai, CN

Key dates

Filing date	Mar 26, 2020
Grant date	Jun 21, 2022
Priority date	—
Expiry date	Sep 3, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/045
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method for generating final transcriptions representing numerical sequences of utterances in a written domain includes receiving audio data for an utterance containing a numeric sequence, and decoding, using a sequence-to-sequence speech recognition model, the audio data for the utterance to generate, as output from the sequence-to-sequence speech recognition model, an intermediate transcription of the utterance. The method also includes processing, using a neural corrector/denormer, the intermediate transcription to generate a final transcription that represents the numeric sequence of the utterance in a written domain. The neural corrector/denormer is trained on a set of training samples, where each training sample includes a speech recognition hypothesis for a training utterance and a ground-truth transcription of the training utterance. The ground-truth transcription of the training utterance is in the written domain. The method also includes providing the final transcription representing the numeric sequence of the utterance in the written domain for output.
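
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: `TrainingSample`, `toy_denormer`, and `transcribe` are illustrative names, the decoder is passed in as a plain callable standing in for the sequence-to-sequence speech recognition model, and the corrector/denormer is replaced by a simple rule-based digit mapper rather than a trained neural network. The sketch only shows the data flow from audio to intermediate transcription to written-domain output, plus the shape of a training sample.

```python
from dataclasses import dataclass
from typing import Callable

# ASSUMPTION: a tiny spoken-digit vocabulary chosen for illustration only.
# "oh" is treated as zero, which a real system would have to disambiguate.
_SPOKEN_DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


@dataclass
class TrainingSample:
    """Hypothetical container mirroring the abstract's training sample:
    a recognition hypothesis paired with a written-domain ground truth."""
    hypothesis: str
    ground_truth_written: str


def toy_denormer(intermediate: str) -> str:
    """Rule-based stand-in for the neural corrector/denormer.

    Collapses each run of consecutive spoken digit words into one
    written-domain digit string and leaves other tokens unchanged.
    """
    out = []
    run = []  # digits collected from the current run of digit words
    for token in intermediate.split():
        digit = _SPOKEN_DIGITS.get(token.lower())
        if digit is not None:
            run.append(digit)
            continue
        if run:  # a non-digit token ends the current run
            out.append("".join(run))
            run = []
        out.append(token)
    if run:  # flush a run that reaches the end of the utterance
        out.append("".join(run))
    return " ".join(out)


def transcribe(
    audio: bytes,
    decode: Callable[[bytes], str],
    denorm: Callable[[str], str] = toy_denormer,
) -> str:
    """Decode audio to an intermediate transcription, then denormalize it."""
    intermediate = decode(audio)  # stage 1: speech recognition model (stubbed)
    return denorm(intermediate)   # stage 2: corrector/denormer (toy version)


# A training pair in the shape the abstract describes.
sample = TrainingSample(
    hypothesis="my zip code is nine four oh four three",
    ground_truth_written="my zip code is 94043",
)
```

Calling `transcribe(audio, decode=my_model)` with any callable that returns a spoken-domain string runs both stages in order. In the patented method the second stage is a trained neural model, so it can also correct recognition errors, which this rule-based stand-in cannot do.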

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.