Patent · US Active

Advancing word-based speech recognition processing

US10629193B2 · kind B2 · utility

Cited by	8
References	2
Claims	20
Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Guoli Ye · Sammamish, US
James G. Droppo · Carnation, US
Jinyu Li · Beijing, CN
Rui Zhao · Beijing, CN
Yifan Gong · Sammamish, US

Key dates

Filing date	Mar 9, 2018
Grant date	Apr 21, 2020
Priority date	—
Expiry date	Apr 28, 2038

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/223
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Non-limiting examples of the present disclosure describe advancements in acoustic-to-word modeling that improve accuracy in speech recognition processing through the replacement of out-of-vocabulary (OOV) tokens. During the decoding of speech signals, better accuracy in speech recognition processing is achieved through training and implementation of multiple different solutions that present enhanced speech recognition models. In one example, a hybrid neural network model for speech recognition processing combines a word-based neural network model as a primary model and a character-based neural network model as an auxiliary model. The primary word-based model emits a word sequence, and an output of the character-based auxiliary model is consulted at a segment where the word-based model emits an OOV token. In another example, a mixed unit speech recognition model is developed and trained to generate a mixed word and character sequence during decoding of a speech signal without requiring generation of OOV tokens.
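
The hybrid example in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes both models have already been decoded into time-stamped segments, and it picks the character-model word with the greatest frame overlap as the replacement. The `<OOV>` token string, the `(token, start_frame, end_frame)` layout, and the function names `overlap` and `replace_oov` are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

OOV = "<OOV>"  # hypothetical marker; the abstract does not name the token string

# Assumed layout for one decoded segment: (token, start_frame, end_frame)
Segment = Tuple[str, int, int]


def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of frames shared by two segments (0 if they do not overlap)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def replace_oov(word_hyp: List[Segment], char_hyp: List[Segment]) -> List[str]:
    """Keep the word-model output, consulting the character model only at OOV segments."""
    result = []
    for token, start, end in word_hyp:
        if token != OOV:
            result.append(token)  # primary model is trusted for in-vocabulary words
            continue
        # Assumption: choose the character-model word that overlaps the OOV segment most.
        best = max(
            char_hyp,
            key=lambda seg: overlap(start, end, seg[1], seg[2]),
            default=None,
        )
        if best is not None and overlap(start, end, best[1], best[2]) > 0:
            result.append(best[0])
        else:
            result.append(OOV)  # nothing usable from the auxiliary model
    return result


word_output = [("play", 0, 10), (OOV, 10, 30), ("songs", 30, 40)]
char_output = [("play", 0, 9), ("beyonce", 11, 29), ("songs", 31, 40)]
print(replace_oov(word_output, char_output))
```

In this sketch the word-based output is left untouched everywhere except the OOV segment, which mirrors the primary/auxiliary split described in the abstract. The overlap heuristic is an assumption standing in for whatever alignment the actual models use.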

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.