Patent · US Active

Code-switching speech recognition with end-to-end connectionist temporal classification model

US10964309B2 · kind B2 · utility

Cited by	2

References	1

Claims	20

Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Jinyu Li · Beijing, CN
Guoli Ye · Sammamish, US
Rui Zhao · Beijing, CN
Yifan Gong · Sammamish, US
Ke Li · Qingdao, CN

Key dates

Filing date	May 13, 2019
Grant date	Mar 30, 2021
Priority date	—
Expiry date	May 13, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/0635
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A code-switching (CS) connectionist temporal classification (CTC) model may be initialized from a major language CTC model by keeping network hidden weights and replacing output tokens with a union of major and secondary language output tokens. The initialized model may be trained by updating parameters with training data from both languages, and a language identification (LID) model may also be trained with the data. During a decoding process for each of a series of audio frames, if silence dominates a current frame then a silence output token may be emitted. If silence does not dominate the frame, then a major language output token posterior vector from the CS CTC model may be multiplied with the LID major language probability to create a probability vector for the major language. A similar step is performed for the secondary language, and the system may emit an output token associated with the highest probability across all tokens from both languages.
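
The per-frame decoding rule in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`SILENCE`, `union_output_tokens`, `decode_frame`, `silence_threshold`) is hypothetical. Two points the abstract leaves open are filled in by assumption: "silence dominates" is taken to mean that the silence posterior exceeds a fixed threshold, and a token shared by both languages keeps the larger of its two weighted scores.

```python
from typing import Dict, List, Sequence

SILENCE = "<sil>"  # hypothetical label for the silence output token


def union_output_tokens(major: Sequence[str], secondary: Sequence[str]) -> List[str]:
    """Build the CS model's output set as the union of both token sets.

    Major-language tokens keep their original order; secondary-language
    tokens not already present are appended.
    """
    merged = list(major)
    seen = set(merged)
    for tok in secondary:
        if tok not in seen:
            merged.append(tok)
            seen.add(tok)
    return merged


def decode_frame(
    posteriors: Dict[str, float],
    major_tokens: Sequence[str],
    secondary_tokens: Sequence[str],
    lid_major: float,
    lid_secondary: float,
    silence_threshold: float = 0.5,  # assumption: the abstract gives no value
) -> str:
    """Pick one output token for a single audio frame."""
    # Step 1: emit silence when it dominates the frame (assumed threshold test).
    if posteriors.get(SILENCE, 0.0) > silence_threshold:
        return SILENCE

    # Step 2: weight major-language token posteriors by the LID major probability.
    scored: Dict[str, float] = {}
    for tok in major_tokens:
        scored[tok] = posteriors.get(tok, 0.0) * lid_major

    # Step 3: same for the secondary language; a shared token keeps its
    # larger score (assumption, not stated in the abstract).
    for tok in secondary_tokens:
        weighted = posteriors.get(tok, 0.0) * lid_secondary
        scored[tok] = max(scored.get(tok, 0.0), weighted)

    # Step 4: emit the token with the highest weighted probability overall.
    return max(scored, key=scored.get)


if __name__ == "__main__":
    frame = {SILENCE: 0.1, "hello": 0.5, "nihao": 0.4}
    print(decode_frame(frame, ["hello"], ["nihao"], lid_major=0.3, lid_secondary=0.7))
```

In this sketch the LID probabilities act as a per-frame prior over languages: a token with a lower CTC posterior can still win when the LID model strongly favors its language.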

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.