Method and system for speech reconstruction from speech recognition features, pitch and voicing with resampled basis functions providing reconstruction of the spectral envelope
US6725190B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 2, 1999 |
| Grant date | Apr 20, 2004 |
| Priority date | — |
| Expiry date | Nov 2, 2019 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L25/18
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A speech reconstruction method and system for converting a series of binned spectra, or functions thereof such as the Mel Frequency Cepstral Coefficients (MFCC), of an original digitized speech signal into a reconstructed speech signal, where each binned spectrum has a respective pitch value and voicing decision. The binned spectra are derived from the original digitized speech signal at successive time instances by multiplying each estimate of the spectral envelope by a predetermined set of frequency-domain window functions and computing the integrals thereof. At each respective time instance, harmonic frequencies and weights are generated according to the respective pitch value and voicing decision. Basis functions having bounded supports on the frequency axis are each sampled at all said harmonic frequencies that lie within their respective supports, and are multiplied by the respective harmonic weights. The sampled basis functions are combined with respective phases, generated according to the pitch value, voicing decision and possibly the binned spectrum, resulting in a complex line spectrum corresponding to each basis function. Coefficients are generated of the basis functions, and each of the points o…
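The first steps named in the abstract (binning a spectral envelope with frequency-domain windows, generating harmonic frequencies from the pitch, and sampling a bounded-support basis function at those harmonics) can be illustrated with a minimal Python sketch. It is not the patented implementation. The triangular window shape, the flat test envelope, the uniform harmonic weights, the 4000 Hz band limit and all function names are assumptions made for illustration. The phase generation and coefficient fitting mentioned later in the abstract are omitted.

```python
def triangular_window(center_hz, half_width_hz):
    """Window with bounded support (center - half_width, center + half_width).

    The triangular shape is an assumption; the abstract only says
    "a predetermined set of frequency domain window functions".
    """
    def window(freq_hz):
        return max(0.0, 1.0 - abs(freq_hz - center_hz) / half_width_hz)
    return window


def binned_spectrum(envelope, windows, f_max_hz, n_points=2000):
    """Integrate envelope(f) * window(f) over [0, f_max] for each window.

    Uses a simple midpoint rule as the numerical integral.
    """
    df = f_max_hz / n_points
    grid = [(i + 0.5) * df for i in range(n_points)]
    return [sum(envelope(f) * w(f) for f in grid) * df for w in windows]


def harmonic_frequencies(pitch_hz, f_max_hz):
    """Integer multiples of the pitch up to f_max (a voiced frame is assumed)."""
    count = int(f_max_hz // pitch_hz)
    return [k * pitch_hz for k in range(1, count + 1)]


def sample_basis(basis, harmonics, weights):
    """Sample one basis function at the harmonics inside its support,
    scaling each sample by the corresponding harmonic weight."""
    return [(f, wt * basis(f))
            for f, wt in zip(harmonics, weights)
            if basis(f) > 0.0]


# Hypothetical inputs: three windows, a flat envelope, a 200 Hz pitch.
windows = [triangular_window(c, 500.0) for c in (500.0, 1000.0, 1500.0)]
bins = binned_spectrum(lambda f: 1.0, windows, 4000.0)

harmonics = harmonic_frequencies(200.0, 4000.0)
weights = [1.0] * len(harmonics)  # uniform weights, an assumption
lines = sample_basis(windows[0], harmonics, weights)
```

Here the windows double as the basis functions, which is only one possible choice. With the flat envelope each bin equals the area of its triangle, and the first basis function is sampled only at the harmonics that fall strictly inside its support.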
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.