Patent · US Active

Speaker-invariant training via adversarial learning

US10347241B1 · kind B1 · utility

Cited by	10
References	0
Claims	20
Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Zhong Meng · Seattle, US
Vadim Mazalov · Issaquah, US
Yifan Gong · Sammamish, US
Yong Zhao · Beijing, CN
Zhuo Chen · Markham, CA
Jinyu Li · Beijing, CN

Key dates

Filing date	Mar 23, 2018
Grant date	Jul 9, 2019
Priority date	—
Expiry date	Mar 23, 2038

Classification

Technology area (CPC G)	Physics
CPC primary	G10L17/18
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Systems and methods can be implemented to conduct speaker-invariant training for speech recognition in a variety of applications. An adversarial multi-task learning scheme for speaker-invariant training can be implemented, aiming at actively curtailing the inter-talker feature variability, while maximizing its senone discriminability to enhance the performance of a deep neural network (DNN) based automatic speech recognition system. In speaker-invariant training, a DNN acoustic model and a speaker classifier network can be jointly optimized to minimize the senone (triphone state) classification loss, and simultaneously mini-maximize the speaker classification loss. A speaker invariant and senone-discriminative intermediate feature is learned through this adversarial multi-task learning, which can be applied to an automatic speech recognition system. Additional systems and methods are disclosed.
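Illustrative note (not part of the patent text): the min-max objective described in the abstract is often realized with a gradient-reversal style update, in which the shared feature extractor descends the senone loss while ascending the speaker loss. The sketch below assumes that formulation. The function name `sit_update`, the parameter names, the learning rate `lr`, and the trade-off weight `lam` are hypothetical, and the gradients are assumed to be supplied by some external backpropagation step.

```python
from typing import List, Tuple

Vector = List[float]


def sit_update(
    theta_f: Vector,   # feature-extractor parameters (hypothetical name)
    theta_y: Vector,   # senone-classifier parameters
    theta_s: Vector,   # speaker-classifier parameters
    g_sen_f: Vector,   # d(senone loss)/d(theta_f)
    g_sen_y: Vector,   # d(senone loss)/d(theta_y)
    g_spk_f: Vector,   # d(speaker loss)/d(theta_f)
    g_spk_s: Vector,   # d(speaker loss)/d(theta_s)
    lr: float = 0.1,   # learning rate (assumed value)
    lam: float = 0.5,  # weight of the adversarial term (assumed value)
) -> Tuple[Vector, Vector, Vector]:
    """One gradient step of an adversarial multi-task update (sketch)."""
    # Feature extractor: minimize the senone loss and, through the
    # sign-reversed speaker gradient, maximize the speaker loss.
    new_f = [p - lr * (gs - lam * gk)
             for p, gs, gk in zip(theta_f, g_sen_f, g_spk_f)]
    # Senone classifier: ordinary descent on the senone loss.
    new_y = [p - lr * g for p, g in zip(theta_y, g_sen_y)]
    # Speaker classifier: ordinary descent on the speaker loss, so it
    # keeps trying to identify the speaker from the shared feature.
    new_s = [p - lr * g for p, g in zip(theta_s, g_spk_s)]
    return new_f, new_y, new_s
```

The only difference from plain multi-task training is the minus sign on the speaker gradient in the feature-extractor update. That sign is what pushes the intermediate feature toward speaker invariance while the senone term keeps it discriminative.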

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.