Patent · US Active

Speech recognition using connectionist temporal classification

US10580432B2 · kind B2 · utility

Cited by	0

References	1

Claims	20

Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Amit Das · Pickerington, US
Jinyu Li · Beijing, CN
Rui Zhao · Beijing, CN
Yifan Gong · Sammamish, US

Key dates

Filing date	Feb 28, 2018
Grant date	Mar 3, 2020
Priority date	—
Expiry date	Aug 25, 2038

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/0635
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Generally discussed herein are devices, systems, and methods for speech recognition. Processing circuitry can implement a connectionist temporal classification (CTC) neural network (NN) that includes: an encode NN to receive an audio frame and generate a current encoded hidden feature vector; an attend NN to generate, based on the current encoded hidden feature vector and a first context vector from a previous time slice, a weight vector indicating how much the current encoded hidden feature vector, a previous encoded hidden feature vector, and a future encoded hidden feature vector from a future time slice each contribute to a current, second context vector; an annotate NN to generate the current, second context vector based on the weight vector, the current encoded hidden feature vector, the previous encoded hidden feature vector, and the future encoded hidden feature vector; and a normal NN to generate a normalized output vector based on the current, second context vector.
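
The data flow in the abstract (attend, annotate, normalize) can be illustrated with a minimal pure-Python sketch. This is not the patented implementation. It assumes the encoded hidden feature vectors are already available, uses a plain dot product against the previous context vector as the attention score, and uses a softmax as the normalization step. The function names `attend`, `annotate`, `normalize`, and `step`, the three-frame window, and the scoring rule are all hypothetical choices made for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attend(window, prev_context):
    """Return one weight per encoded vector in the window.

    Assumption: the score is a dot product between each encoded vector
    and the context vector from the previous time slice. The abstract
    does not specify the scoring function.
    """
    scores = [sum(h * c for h, c in zip(vec, prev_context)) for vec in window]
    return softmax(scores)


def annotate(window, weights):
    """Current context vector: the weighted sum of the window's vectors."""
    dim = len(window[0])
    return [sum(w * vec[d] for w, vec in zip(weights, window)) for d in range(dim)]


def normalize(context):
    """Assumption: normalization is a softmax over the context vector."""
    return softmax(context)


def step(h_prev, h_cur, h_next, prev_context):
    """One time slice over the previous, current, and future encoded vectors."""
    window = [h_prev, h_cur, h_next]
    weights = attend(window, prev_context)
    context = annotate(window, weights)
    return weights, context, normalize(context)


# Example time slice with 2-dimensional encoded vectors.
weights, context, output = step([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0])
```

With an all-zero previous context vector every score is zero, so the three weights are equal and the context vector is the mean of the three encoded vectors. The returned `context` would be passed as `prev_context` to the next time slice.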

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.