Speaker recognition using local models
US7475013B2 · kind B2 · utility
Assignee
Inventor
Key dates
| Filing date | Mar 26, 2004 |
| Grant date | Jan 6, 2009 |
| Priority date | — |
| Expiry date | Jul 6, 2026 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L17/08
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system and method for voice recognition is disclosed. The system enrolls speakers using enrollment voice samples and identification information. An extraction module characterizes the enrollment voice samples as high-dimensional feature vectors, or speaker data points. A data structuring module organizes the data points into a high-dimensional data structure, such as a kd-tree, in which similarity between data points dictates a distance, such as a Euclidean distance, a Minkowski distance, or a Manhattan distance. The system recognizes a speaker using an unidentified voice sample. A data querying module searches the data structure to generate a subset of approximate nearest neighbors based on a high-dimensional feature vector extracted from that sample. A data modeling module uses Parzen windows to estimate a probability density function representing how closely characteristics of the unidentified speaker match those of enrolled speakers, in real time, without extensive training data or parametric assumptions about the data distribution.
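The pipeline described in the abstract (kd-tree over enrolled feature vectors, approximate nearest-neighbor query, Parzen-window scoring) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the toy 2-D vectors standing in for high-dimensional features, the Gaussian kernel, the `bandwidth` value, and the `eps` approximation factor are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

def minkowski(a, b, p=2):
    """Minkowski distance; p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def build_kdtree(points, depth=0):
    """Build a kd-tree from (feature_vector, speaker_id) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through dimensions
    points = sorted(points, key=lambda pt: pt[0][axis])
    mid = len(points) // 2                    # median point splits the space
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def k_nearest(node, query, k, eps=0.0, best=None):
    """Collect up to k (distance, speaker_id) pairs, closest first.

    eps > 0 prunes more aggressively, giving approximate neighbors;
    eps = 0 gives the exact k nearest neighbors.
    """
    if best is None:
        best = []
    if node is None:
        return best
    vec, speaker = node["point"]
    best.append((minkowski(query, vec), speaker))
    best.sort()
    del best[k:]                              # keep only the k closest so far
    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    k_nearest(near, query, k, eps, best)
    # Search the far side only if the splitting plane could hide a closer point.
    if len(best) < k or abs(diff) * (1.0 + eps) < best[-1][0]:
        k_nearest(far, query, k, eps, best)
    return best

def parzen_scores(neighbors, bandwidth=1.0):
    """Gaussian Parzen-window score per speaker, normalized to sum to 1."""
    scores = defaultdict(float)
    for dist, speaker in neighbors:
        scores[speaker] += math.exp(-dist ** 2 / (2.0 * bandwidth ** 2))
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()} if total > 0 else {}

# Hypothetical enrollment data: toy 2-D vectors for two speakers.
enrolled = [
    ((1.0, 1.0), "alice"), ((1.2, 0.9), "alice"), ((0.9, 1.1), "alice"),
    ((5.0, 5.0), "bob"), ((5.1, 4.8), "bob"), ((4.9, 5.2), "bob"),
]
tree = build_kdtree(enrolled)
neighbors = k_nearest(tree, (1.1, 1.0), k=3)
scores = parzen_scores(neighbors, bandwidth=0.5)
best_speaker = max(scores, key=scores.get)
```

Because the scores are computed only from the local neighbor subset returned by the tree query, no global model has to be trained, which is in the spirit of the "local models" in the title. A production system would likely use a dedicated nearest-neighbor library rather than this recursive pure-Python tree.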
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.