End-to-end speaker recognition using deep neural network
US9824692B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Sep 12, 2016 |
| Grant date | Nov 21, 2017 |
| Priority date | — |
| Expiry date | Sep 12, 2036 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L17/22
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
The present invention is directed to a deep neural network (DNN) having a triplet network architecture, which is suitable to perform speaker recognition. In particular, the DNN includes three feed-forward neural networks, which are trained according to a batch process utilizing a cohort set of negative training samples. After each batch of training samples is processed, the DNN may be trained according to a loss function, e.g., utilizing a cosine measure of similarity between respective samples, along with positive and negative margins, to provide a robust representation of voiceprints.
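The abstract names the ingredients of the loss (a cosine similarity measure, a positive margin, a negative margin, and a cohort set of negative samples) but does not give the formula. The sketch below is one illustrative way those pieces could fit together, not the patented loss function. The hinge form, the use of the most similar cohort sample, the margin values, and the names `cosine_similarity`, `triplet_cosine_loss`, `pos_margin`, and `neg_margin` are all assumptions made for this example. The inputs stand in for the embedding vectors that the three feed-forward networks would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def triplet_cosine_loss(anchor, positive, cohort_negatives,
                        pos_margin=0.9, neg_margin=0.3):
    """Hypothetical margin-based loss for one (anchor, positive, cohort) triplet.

    Assumed form (not taken from the patent text):
      - penalize the positive pair when its similarity falls below pos_margin
      - penalize the most similar cohort negative when it exceeds neg_margin
    """
    pos_sim = cosine_similarity(anchor, positive)
    # Assumption: only the hardest (most similar) cohort sample contributes.
    hardest_neg_sim = max(cosine_similarity(anchor, n) for n in cohort_negatives)

    pos_term = max(0.0, pos_margin - pos_sim)
    neg_term = max(0.0, hardest_neg_sim - neg_margin)
    return pos_term + neg_term


# Toy 2-D "embeddings" standing in for network outputs.
well_separated = triplet_cosine_loss(
    anchor=[1.0, 0.0],
    positive=[1.0, 0.0],
    cohort_negatives=[[0.0, 1.0], [-1.0, 0.0]],
)
poorly_separated = triplet_cosine_loss(
    anchor=[1.0, 0.0],
    positive=[0.0, 1.0],
    cohort_negatives=[[1.0, 0.0]],
)
```

Under these assumptions, a triplet whose positive pair is already similar and whose cohort samples are all dissimilar contributes nothing to the loss, while a triplet that violates both margins is penalized on both terms. In a batch process such as the one the abstract describes, a per-triplet value like this would be accumulated over the batch before the network weights are updated.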
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.