Patent · US Active

Predicting word boundaries for on-device batching of end-to-end speech recognition models

US12322383B2 · kind B2 · utility

Cited by	0

References	0

Claims	26

Family size	0

Assignee

Google LLC · US

Inventors

Shaan Jagdeep Patrick Bijwadia · San Francisco, US
Tara N. Sainath · Jersey City, US
Jiahui Yu · Champaign, US
Shuo-yiin Chang · Sunnyvale, US
Yangzhang He · Mountain View, US

Key dates

Filing date	Sep 21, 2022
Grant date	Jun 3, 2025
Priority date	—
Expiry date	Jun 2, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G06N3/09
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method includes receiving a sequence of input audio frames corresponding to an utterance captured by a user device, the utterance including a plurality of words. For each input audio frame, the method includes predicting, using a word boundary detection model configured to receive the sequence of input audio frames as input, whether the input audio frame is a word boundary. The method includes batching the input audio frames into a plurality of batches based on the input audio frames predicted as word boundaries, wherein each batch includes a corresponding plurality of batched input audio frames. For each of the plurality of batches, the method includes processing, using a speech recognition model, the corresponding plurality of batched input audio frames in parallel to generate a speech recognition result.
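
The batching step in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The names `batch_at_word_boundaries`, `energy_boundary`, and `recognize` are hypothetical, the energy threshold is a stand-in for the word boundary detection model, and the sketch assumes that a frame predicted as a boundary closes the batch it belongs to. The recognizer stand-in only reports the batch size; a real model would process the frames of each batch in parallel.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # hypothetical: one feature vector per input audio frame


def batch_at_word_boundaries(
    frames: Sequence[Frame],
    is_boundary: Callable[[Sequence[Frame], int], bool],
) -> List[List[Frame]]:
    """Group frames into batches, closing a batch at each predicted boundary."""
    batches: List[List[Frame]] = []
    current: List[Frame] = []
    for i, frame in enumerate(frames):
        current.append(frame)
        # Assumption: the boundary frame is the last frame of its batch.
        if is_boundary(frames, i):
            batches.append(current)
            current = []
    if current:  # trailing frames after the last predicted boundary
        batches.append(current)
    return batches


def energy_boundary(frames: Sequence[Frame], i: int, threshold: float = 0.1) -> bool:
    """Stand-in for the boundary model: treat a low-energy frame as a boundary."""
    return sum(x * x for x in frames[i]) < threshold


def recognize(batch: List[Frame]) -> str:
    """Stand-in for the speech recognition model applied to one batch."""
    return f"<{len(batch)} frames>"


# Toy one-dimensional "frames"; zeros mimic the pauses between words.
frames = [[1.0], [0.9], [0.0], [0.8], [0.7], [0.6], [0.0], [0.5]]
batches = batch_at_word_boundaries(frames, energy_boundary)
results = [recognize(b) for b in batches]
```

With the toy input, the two zero-valued frames act as predicted boundaries, so the eight frames fall into three batches, and each batch is passed to the recognizer as a unit.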

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.