Patent · US Active

Method and apparatus for combined learning using feature enhancement based on deep neural network and modified loss function for speaker recognition robust to noisy environments

US11854554B2 · kind B2 · utility

Cited by: 3
References: 1
Claims: 4
Family size: 0

Assignee

Inventors

Key dates

Filing date: Mar 30, 2020
Grant date: Dec 26, 2023
Priority date
Expiry date: Sep 14, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2021/02166
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Presented are a combined learning method and apparatus that use a modified loss function and feature enhancement based on a deep neural network for speaker recognition that is robust to noisy environments. According to an embodiment, the method may comprise: a preprocessing step of learning to receive a speech signal as input and to remove a noise or reverberation component from it, using at least one of a beamforming algorithm and a dereverberation algorithm that use the deep neural network; a speaker embedding step of learning to classify the speaker from the speech signal from which the noise or reverberation component has been removed, using a speaker embedding model based on the deep neural network; and a step of connecting the deep neural network model included in at least one of the beamforming algorithm and the dereverberation algorithm to the speaker embedding model, and then performing combined learning of the connected models using a loss function.
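
The three steps in the abstract (an enhancement front end, a speaker embedding model, and combined learning of the two connected models under one loss) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names (`enhance`, `speaker_logits`, `joint_loss`, `train_jointly`) are hypothetical, a per-feature sigmoid mask stands in for the DNN-based beamforming or dereverberation stage, a linear classifier stands in for the speaker embedding model, ordinary softmax cross-entropy stands in for the modified loss function (whose form the abstract does not give), and finite differences stand in for backpropagation so that the sketch needs no external libraries.

```python
import math
import random


def enhance(feats, gains):
    """Stand-in front end (assumption): a learnable sigmoid mask, one gain per feature."""
    return [f / (1.0 + math.exp(-g)) for f, g in zip(feats, gains)]


def speaker_logits(feats, weights):
    """Stand-in speaker model (assumption): one linear score per candidate speaker."""
    return [sum(w * f for w, f in zip(row, feats)) for row in weights]


def cross_entropy(logits, label):
    """Plain softmax cross-entropy for one example, computed stably."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_z - logits[label]


def joint_loss(params, batch, dim, n_spk):
    """Single loss over the connected modules: front-end output feeds the speaker model."""
    gains = params[:dim]
    weights = [params[dim + s * dim: dim + (s + 1) * dim] for s in range(n_spk)]
    losses = [cross_entropy(speaker_logits(enhance(x, gains), weights), y)
              for x, y in batch]
    return sum(losses) / len(losses)


def numeric_grad(params, batch, dim, n_spk, eps=1e-5):
    """Central finite differences over all parameters (stands in for backprop)."""
    grad = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        down = params[:i] + [params[i] - eps] + params[i + 1:]
        grad.append((joint_loss(up, batch, dim, n_spk)
                     - joint_loss(down, batch, dim, n_spk)) / (2 * eps))
    return grad


def make_toy_batch(n=40, seed=1):
    """Synthetic data: two speakers, two informative features, one noisy feature."""
    rng = random.Random(seed)
    batch = []
    for i in range(n):
        label = i % 2
        sign = 1.0 if label == 0 else -1.0
        feats = [sign + rng.gauss(0, 0.3), -sign + rng.gauss(0, 0.3), rng.gauss(0, 2.0)]
        batch.append((feats, label))
    return batch


def train_jointly(batch, dim=3, n_spk=2, steps=100, lr=0.2, seed=0):
    """Gradient descent that updates front-end and speaker parameters together."""
    rng = random.Random(seed)
    params = [rng.uniform(-0.1, 0.1) for _ in range(dim + dim * n_spk)]
    history = [joint_loss(params, batch, dim, n_spk)]
    for _ in range(steps):
        grad = numeric_grad(params, batch, dim, n_spk)
        params = [p - lr * g for p, g in zip(params, grad)]
        history.append(joint_loss(params, batch, dim, n_spk))
    return params, history


if __name__ == "__main__":
    _, history = train_jointly(make_toy_batch())
    print("initial loss:", history[0], "final loss:", history[-1])
```

The point of the sketch is only that the front-end gains and the speaker weights sit in one parameter vector and are updated from the same classification loss, so the enhancement stage is tuned for the recognition objective rather than trained in isolation. A realistic system would replace both stand-ins with neural networks and use automatic differentiation.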

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.