Large margin training for attention-based end-to-end speech recognition
US10861441B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Feb 14, 2019 |
| Grant date | Dec 8, 2020 |
| Priority date | — |
| Expiry date | May 23, 2039 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2015/0635
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A method of attention-based end-to-end (E2E) automatic speech recognition (ASR) training includes performing cross-entropy training of a model based on one or more input features of a speech signal, performing a beam search over the cross-entropy-trained model to generate an n-best list of output hypotheses, and determining a one-best hypothesis from the generated n-best list. The method further includes determining a character-based gradient and a word-based gradient, based on the cross-entropy-trained model and a loss function in which a distance between a reference sequence and the determined one-best hypothesis is maximized, and backpropagating the determined character-based gradient and word-based gradient to the model to update the model.
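The steps in the abstract (n-best list, one-best selection, a margin between reference and one-best, gradients at character and word granularity) can be illustrated with a small Python sketch. This is not the patented formulation: the abstract does not give the loss, so the hinge form, the use of edit distance as the required margin, and all names here (`Hypothesis`, `one_best`, `edit_distance`, `margin_loss`, `margin_loss_grads`) are assumptions made for illustration. The scores stand in for model log-probabilities that a real beam search would produce.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    log_prob: float  # stand-in for the model score log P(text | audio)


def one_best(nbest):
    """Pick the highest-scoring hypothesis from an n-best list."""
    return max(nbest, key=lambda h: h.log_prob)


def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def margin_loss(ref_log_prob, hyp_log_prob, margin):
    """Assumed hinge loss: zero once the reference outscores the
    hypothesis by at least `margin`, so minimizing it widens the gap."""
    return max(0.0, margin - (ref_log_prob - hyp_log_prob))


def margin_loss_grads(ref_log_prob, hyp_log_prob, margin):
    """Gradient of the hinge loss w.r.t. (ref_log_prob, hyp_log_prob)."""
    active = margin - (ref_log_prob - hyp_log_prob) > 0
    return (-1.0, 1.0) if active else (0.0, 0.0)


# Hypothetical example data; in practice these come from beam search.
reference = Hypothesis("the cat sat", -4.0)
nbest = [
    Hypothesis("the cat sat", -4.0),
    Hypothesis("the cat sad", -3.5),
    Hypothesis("a cat sat", -6.0),
]
best = one_best(nbest)

# Assumption: the required margin is the edit distance, measured once over
# characters and once over words, giving two loss terms and two gradients.
char_margin = float(edit_distance(reference.text, best.text))
word_margin = float(edit_distance(reference.text.split(), best.text.split()))

char_loss = margin_loss(reference.log_prob, best.log_prob, char_margin)
word_loss = margin_loss(reference.log_prob, best.log_prob, word_margin)
char_grad = margin_loss_grads(reference.log_prob, best.log_prob, char_margin)
word_grad = margin_loss_grads(reference.log_prob, best.log_prob, word_margin)
```

In a real trainer the two gradients would be backpropagated through the network that produced the scores; here they are only the derivatives with respect to the two scores themselves. When the one-best hypothesis equals the reference, both margins are zero and the loss terms vanish.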
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.