Patent · US Active

Method and apparatus for combined learning using feature enhancement based on deep neural network and modified loss function for speaker recognition robust to noisy environments

US11854554B2 · kind B2 · utility

3Cited by

1References

4Claims

0Family size

Assignee

IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY) · KR

Inventors

Joon Chang · Austin, US
Joonyoung Yang · Bucheon-si, KR

Key dates

Filing date	Mar 30, 2020
Grant date	Dec 26, 2023
Priority date	—
Expiry date	Sep 14, 2040

Classification

Technology area (CPC G)Physics
CPC primaryG10L2021/02166
WIPO fieldComputer technology
WIPO sectorElectrical engineering

Abstract

Presented are a combined learning method and device using a transformed loss function and feature enhancement based on a deep neural network for speaker recognition that is robust to a noisy environment. The combined learning method using the transformed loss function and the feature enhancement based on the deep neural network for speaker recognition that is robust to the noisy environment, according to an embodiment, may comprise: a preprocessing step for learning to receive, as an input, a speech signal and remove a noise or reverberation component by using at least one of a beamforming algorithm and a dereverberation algorithm using the deep neural network; a speaker embedding step for learning to classify an utterer from the speech signal, from which a noise or reverberation component has been removed, by using a speaker embedding model based on the deep neural network; and a step for, after connecting a deep neural network model included in at least one of the beamforming algorithm and the dereverberation algorithm and the speaker embedding model, for speaker embedding, based on the deep neural network, performing combined learning by using a loss function.

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.