Patent · US Active

Efficient memory transformer based acoustic model for low latency streaming speech recognition

US11646017B1 · kind B1 · utility

1Cited by
0References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateMar 5, 2021
Grant dateMay 9, 2023
Priority date
Expiry dateSep 10, 2041

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG10L15/28
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

In one embodiment, a method includes accessing a machine-learning model configured to generate an encoding for an utterance by using a module to process data associated with each segment of the utterance in a series of iterations, performing operations associated with an i-th segment during an n-th iteration by the module, which include receiving an input comprising input contextual embeddings generated for the i-th segment in a preceding iteration and a memory bank storing memory vectors generated in the preceding iteration for segments preceding the i-th segment, generating attention outputs and a memory vector based on keys, values, and queries generated using the input, and generating output contextual embeddings for the i-th segment based on the attention outputs, providing the memory vector to the module for performing operations associated with the i-th segment in a next iteration, and performing speech recognition by decoding the encoding of the utterance.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.