Patent · US Active

Efficient memory transformer based acoustic model for low latency streaming speech recognition

US11646017B1 · kind B1 · utility

Cited by	1
References	0
Claims	20
Family size	0

Assignee

Meta Platforms, Inc. · US

Inventors

Yangyang Shi · San Diego, US
Yongqiang Wang · Shanghai, CN
Chunyang Wu · Greer, US
Ching-Feng Yeh · Hsinchu, TW
Julian Chan · Seattle, US
Qiaochu Zhang · Beijing, CN
Duc Hoang Le · Sunnyvale, US
Michael L. Seltzer · Seattle, US

Key dates

Filing date	Mar 5, 2021
Grant date	May 9, 2023
Priority date	—
Expiry date	Sep 10, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/28
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

In one embodiment, a method includes: accessing a machine-learning model configured to generate an encoding for an utterance by using a module to process data associated with each segment of the utterance in a series of iterations; performing, by the module, operations associated with an i-th segment during an n-th iteration, the operations including (1) receiving an input comprising input contextual embeddings generated for the i-th segment in a preceding iteration and a memory bank storing memory vectors generated in the preceding iteration for segments preceding the i-th segment, (2) generating attention outputs and a memory vector based on keys, values, and queries generated using the input, and (3) generating output contextual embeddings for the i-th segment based on the attention outputs; providing the memory vector to the module for performing operations associated with the i-th segment in a next iteration; and performing speech recognition by decoding the encoding of the utterance.
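
The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The names `segment_step` and `encode_utterance`, the mean-pooled summary query used to produce the memory vector, the single attention head, the residual connection, and the random weights are all assumptions made for illustration; the abstract specifies none of them. The final decoding step is omitted.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)


def segment_step(segment, memory_bank, w_q, w_k, w_v):
    """One iteration of the module for one segment (hypothetical sketch).

    segment:     (frames, dim) contextual embeddings from the preceding iteration
    memory_bank: (slots, dim) memory vectors from the preceding iteration,
                 one per earlier segment; slots may be 0
    Returns (output embeddings (frames, dim), memory vector (dim,)).
    """
    # Assumption: a mean-pooled summary of the segment acts as an extra query
    # whose attention output becomes the memory vector.
    summary = segment.mean(axis=0, keepdims=True)

    # Keys and values are generated from the memory bank plus the segment.
    context = np.concatenate([memory_bank, segment], axis=0)
    keys = context @ w_k
    values = context @ w_v

    # Queries are generated from the segment frames plus the summary.
    queries = np.concatenate([segment, summary], axis=0) @ w_q

    scores = queries @ keys.T / np.sqrt(keys.shape[-1])
    attended = softmax(scores) @ values

    attention_outputs = attended[:-1]  # one row per frame
    memory_vector = attended[-1]       # row produced by the summary query

    # Assumption: a residual connection stands in for the rest of the layer.
    return segment + attention_outputs, memory_vector


def encode_utterance(segments, layer_weights):
    """Run every segment through a series of iterations (one per weight set)."""
    dim = segments[0].shape[1]
    embeddings = list(segments)
    prev_memory = []  # no memory vectors exist before the first iteration
    for w_q, w_k, w_v in layer_weights:
        new_embeddings, new_memory = [], []
        for i, segment in enumerate(embeddings):
            # Only memory vectors of segments preceding segment i are visible.
            earlier = prev_memory[:i]
            bank = np.stack(earlier) if earlier else np.zeros((0, dim))
            out, mem = segment_step(segment, bank, w_q, w_k, w_v)
            new_embeddings.append(out)
            new_memory.append(mem)
        embeddings, prev_memory = new_embeddings, new_memory
    return np.concatenate(embeddings, axis=0)


rng = np.random.default_rng(0)
dim = 8
segments = [rng.standard_normal((4, dim)) for _ in range(3)]
layer_weights = [
    tuple(rng.standard_normal((dim, dim)) * 0.1 for _ in range(3))
    for _ in range(2)
]
encoding = encode_utterance(segments, layer_weights)
print(encoding.shape)  # (12, 8)
```

Because each segment sees only its own frames and the memory vectors of earlier segments, changing a later segment leaves the encodings of earlier segments unchanged. That property is what makes this style of model usable for streaming recognition.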

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.