Patent · US Active

Very deep convolutional neural networks for end-to-end speech recognition

US11080599B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Nov 22, 2019
Grant date: Aug 3, 2021
Priority date
Expiry date: Nov 22, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L15/22
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A speech recognition neural network system includes an encoder neural network and a decoder neural network. The encoder neural network generates an encoded sequence from an input acoustic sequence that represents an utterance. The input acoustic sequence includes a respective acoustic feature representation at each of a plurality of input time steps, the encoded sequence includes a respective encoded representation at each of a plurality of time-reduced time steps, and the number of time-reduced time steps is less than the number of input time steps. The encoder neural network includes a time reduction subnetwork, a convolutional LSTM subnetwork, and a network-in-network subnetwork. The decoder neural network receives the encoded sequence and processes the encoded sequence to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings.
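
The abstract fixes two shapes: the encoder emits fewer time steps than it receives, and the decoder emits one score per candidate substring at each output position. The sketch below illustrates only those shapes. It is not the patented method. The abstract does not say how time reduction is performed, so concatenating adjacent frames is an assumption, as is using a softmax to produce the scores. The names `time_reduce` and `substring_scores` are hypothetical, and the convolutional LSTM and network-in-network subnetworks are omitted entirely.

```python
import math


def time_reduce(frames, factor=2):
    """Merge each group of `factor` consecutive frames into one longer frame.

    Assumption: reduction by concatenation. Trailing frames that do not fill
    a complete group are dropped.
    """
    usable = len(frames) - len(frames) % factor
    reduced = []
    for start in range(0, usable, factor):
        merged = []
        for frame in frames[start:start + factor]:
            merged.extend(frame)
        reduced.append(merged)
    return reduced


def substring_scores(logits, substrings):
    """Turn raw logits into one normalized score per candidate substring.

    Assumption: a numerically stable softmax stands in for the decoder output.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {s: e / total for s, e in zip(substrings, exps)}


# Toy input acoustic sequence: 8 input time steps, 3 features per step.
acoustic = [[float(t), float(t) + 0.5, float(t) + 1.0] for t in range(8)]

# Two stacked reductions: 8 steps become 4, then 2.
encoded = time_reduce(time_reduce(acoustic))

# Scores for one output position over a hypothetical substring set.
scores = substring_scores([2.0, 1.0, 0.5], ["a", "b", "ab"])
```

Each reduction by a factor of 2 halves the number of time steps while doubling the per-step feature width, so the information in the input is preserved while the sequence the decoder must process gets shorter.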

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.