Patent · US Active

Very deep convolutional neural networks for end-to-end speech recognition

US10510004B2 · kind B2 · utility

Cited by: 1
References: 0
Claims: 10
Family size: 0

Assignee

Inventors

Key dates

Filing date: Apr 10, 2019
Grant date: Dec 17, 2019
Priority date
Expiry date: Apr 10, 2039

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L15/22
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A speech recognition neural network system includes an encoder neural network and a decoder neural network. The encoder neural network generates an encoded sequence from an input acoustic sequence that represents an utterance. The input acoustic sequence includes a respective acoustic feature representation at each of a plurality of input time steps, the encoded sequence includes a respective encoded representation at each of a plurality of time-reduced time steps, and the number of time-reduced time steps is less than the number of input time steps. The encoder neural network includes a time reduction subnetwork, a convolutional LSTM subnetwork, and a network-in-network subnetwork. The decoder neural network receives the encoded sequence and processes the encoded sequence to generate, for each position in an output sequence order, a set of substring scores that includes a respective substring score for each substring in a set of substrings.
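
The abstract states only that the encoded sequence has fewer time steps than the input acoustic sequence; it does not say how the time reduction subnetwork achieves this. As a minimal illustration of the shape change, the sketch below assumes one common approach: concatenating adjacent frames. The function name `time_reduce`, the reduction factor of 2, and the toy feature values are all hypothetical and are not taken from the patent.

```python
from typing import List

Frame = List[float]


def time_reduce(frames: List[Frame], factor: int = 2) -> List[Frame]:
    """Merge every `factor` consecutive frames into one wider frame.

    Assumption for illustration only: reduction by concatenation.
    Trailing frames that do not fill a complete group are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    usable = len(frames) - len(frames) % factor
    reduced: List[Frame] = []
    for start in range(0, usable, factor):
        merged: Frame = []
        for frame in frames[start:start + factor]:
            merged.extend(frame)  # stack features of neighbouring time steps
        reduced.append(merged)
    return reduced


# Toy input acoustic sequence: 8 input time steps, 3 features per step.
acoustic = [[float(t), t + 0.1, t + 0.2] for t in range(8)]

# Two stacked reductions: 8 -> 4 -> 2 time-reduced time steps.
encoded = time_reduce(time_reduce(acoustic))

print(len(acoustic), len(encoded), len(encoded[0]))  # 8 2 12
```

In this sketch each reduction halves the number of time steps and doubles the feature width, so the downstream layers and the decoder would operate on a shorter sequence, which is the property the abstract describes.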

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.