Patent · US Active

Speaker recognition using local models

US7475013B2 · kind B2 · utility

Cited by: 7
References: 4
Claims: 37
Family size: 0

Assignee

Inventor

Key dates

Filing date: Mar 26, 2004
Grant date: Jan 6, 2009
Priority date
Expiry date: Jul 6, 2026

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L17/08
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A system and method for voice recognition is disclosed. The system enrolls speakers using enrollment voice samples and identification information. An extraction module characterizes enrollment voice samples with high-dimensional feature vectors or speaker data points. A data structuring module organizes data points into a high-dimensional data structure, such as a kd-tree, in which similarity between data points dictates a distance, such as a Euclidean distance, a Minkowski distance, or a Manhattan distance. The system recognizes a speaker using an unidentified voice sample. A data querying module searches the data structure to generate a subset of approximate nearest neighbors based on an extracted high-dimensional feature vector. A data modeling module uses Parzen windows to estimate a probability density function representing how closely characteristics of the unidentified speaker match enrolled speakers, in real time, without extensive training data or parametric assumptions about data distribution.
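
The pipeline described in the abstract (enroll feature vectors in a kd-tree, retrieve nearest neighbors for an unidentified sample, then weight them with a Parzen window) can be illustrated with a minimal Python sketch. This is not the patented implementation. Everything in it is an assumption made for illustration: the function names (build_kdtree, k_nearest, parzen_scores), the toy 2-D vectors standing in for high-dimensional voice features, the Gaussian kernel, the bandwidth value, and the use of an exact rather than approximate nearest-neighbor search. The scoring step is a simplified, local kernel-weighted vote over the retrieved neighbors, not a full density estimate.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_kdtree(points, depth=0):
    """Build a kd-tree from a list of (vector, speaker_id) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2                    # median point splits the space
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def k_nearest(node, query, k, best=None):
    """Exact k-nearest-neighbor search (the abstract describes an approximate one).

    Returns a list of (distance, speaker_id), sorted by ascending distance.
    """
    if best is None:
        best = []
    if node is None:
        return best
    vec, speaker = node["point"]
    best.append((euclidean(query, vec), speaker))
    best.sort(key=lambda t: t[0])
    del best[k:]                              # keep only the k closest so far
    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    k_nearest(near, query, k, best)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th best distance (or fewer than k neighbors were found).
    if len(best) < k or abs(diff) < best[-1][0]:
        k_nearest(far, query, k, best)
    return best


def parzen_scores(neighbors, bandwidth):
    """Gaussian-kernel weight per speaker over the neighbors, normalized to 1."""
    totals = defaultdict(float)
    for dist, speaker in neighbors:
        totals[speaker] += math.exp(-0.5 * (dist / bandwidth) ** 2)
    norm = sum(totals.values())
    if norm == 0.0:                           # all kernel weights underflowed
        return {}
    return {s: v / norm for s, v in totals.items()}


# Toy enrollment data: hypothetical 2-D "feature vectors" for two speakers.
enrolled = [
    ((0.0, 0.0), "alice"), ((0.1, 0.2), "alice"), ((0.2, 0.1), "alice"),
    ((5.0, 5.0), "bob"), ((5.1, 4.9), "bob"), ((4.9, 5.2), "bob"),
]
tree = build_kdtree(enrolled)

query = (0.1, 0.1)                            # unidentified sample's vector
neighbors = k_nearest(tree, query, k=3)
scores = parzen_scores(neighbors, bandwidth=0.5)
best_speaker = max(scores, key=scores.get)
print(best_speaker, scores)
```

Because only the retrieved neighbors contribute to the score, the cost of recognition depends on the neighbor count k rather than on the total number of enrolled data points, which is consistent with the abstract's emphasis on real-time operation. Swapping euclidean for a Manhattan or general Minkowski distance would also require adjusting the pruning test in k_nearest.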

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.