Patent · US Active

Code-switching speech recognition with end-to-end connectionist temporal classification model

US10964309B2 · kind B2 · utility

2Cited by
1References
20Claims
0Family size

Assignee

Inventors

Key dates

Filing dateMay 13, 2019
Grant dateMar 30, 2021
Priority date
Expiry dateMay 13, 2039

Classification

  • Technology area (CPC G)Physics
  • CPC primaryG10L2015/0635
  • WIPO fieldComputer technology
  • WIPO sectorElectrical engineering

Abstract

A CS CTC model may be initialed from a major language CTC model by keeping network hidden weights and replacing output tokens with a union of major and secondary language output tokens. The initialized model may be trained by updating parameters with training data from both languages, and a LID model may also be trained with the data. During a decoding process for each of a series of audio frames, if silence dominates a current frame then a silence output token may be emitted. If silence does not dominate the frame, then a major language output token posterior vector from the CS CTC model may be multiplied with the LID major language probability to create a probability vector from the major language. A similar step is performed for the secondary language, and the system may emit an output token associated with the highest probability across all tokens from both languages.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.