Method and apparatus for language and speaker recognition
US5189727A · kind A · utility
Assignees
Inventor
Key dates
| Filing date | Aug 26, 1991 |
| Grant date | Feb 23, 1993 |
| Priority date | — |
| Expiry date | Aug 26, 2011 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L17/04
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
An initial learning phase creates histograms for each of the languages to be recognized. A first pass enters a number of samples of speech, and at each predetermined instant of time, each sample of speech is Fast Fourier Transformed (FFT) to create a spectrum showing the frequency content of the speech at that instant of time (a spectral vector). The frequency content is compared with frequency contents that have been previously stored. If the current spectral vector is close enough to a previously stored spectral vector, a weighted average of the two is formed, and a weight indicating frequency of occurrence is incremented. If the current spectral vector is not similar to one that has been previously stored, it is stored with an initial weight of "1". The most common frequency spectra are determined for all of the languages grouped together to form a composite basis set. A second pass then puts a sample of sounds through the Fast Fourier Transform to again obtain frequency spectra. The obtained frequency spectra are compared against all of the prestored frequency spectra in the composite basis set, and a closest match is determined. A number of occurrences of each frequency spectra i…
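The two passes described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not state a distance measure, a similarity threshold, or the exact averaging formula, so the Euclidean distance, the `threshold` parameter, the occurrence-weighted mean, and all function names below are assumptions made for illustration. A naive discrete Fourier transform stands in for the FFT to keep the example dependency-free, and the occurrence counting in `histogram` reflects one reading of the truncated final sentence.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude spectrum of one frame via a naive DFT (stand-in for an FFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def distance(a, b):
    """Euclidean distance between two spectral vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def update_basis(basis, vec, threshold):
    """First pass: merge vec into the nearest stored vector, or store it anew.

    basis is a list of [vector, weight] entries. If the nearest entry lies
    within threshold, it becomes the occurrence-weighted average of itself
    and vec, and its weight is incremented. Otherwise vec is stored with an
    initial weight of 1.
    """
    if basis:
        entry = min(basis, key=lambda e: distance(e[0], vec))
        if distance(entry[0], vec) <= threshold:
            w = entry[1]
            entry[0] = [(w * s + v) / (w + 1) for s, v in zip(entry[0], vec)]
            entry[1] = w + 1
            return
    basis.append([list(vec), 1])


def most_common(basis, k):
    """Keep the k entries with the highest weights as the composite basis set."""
    return sorted(basis, key=lambda e: e[1], reverse=True)[:k]


def closest_index(basis, vec):
    """Second pass: index of the basis entry closest to vec."""
    return min(range(len(basis)), key=lambda i: distance(basis[i][0], vec))


def histogram(basis, vectors):
    """Count how often each basis entry is the closest match."""
    counts = [0] * len(basis)
    for v in vectors:
        counts[closest_index(basis, v)] += 1
    return counts
```

In use, each speech frame would be passed through `magnitude_spectrum`, the resulting vectors fed to `update_basis` during learning, and `histogram` run on new speech against the composite basis set. How the resulting histograms are compared to reach a decision is not covered by the visible part of the abstract and is left out here.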
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.