Context-dependent acoustic models for medium and large vocabulary speech recognition with eigenvoice training
US6571208B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Nov 29, 1999 |
| Grant date | May 27, 2003 |
| Priority date | — |
| Expiry date | Nov 29, 2019 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L15/07
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A reduced-dimensionality eigenvoice analytical technique is used during training to develop context-dependent acoustic models for allophones. The eigenvoice technique is also applied at run time to the speech of a new speaker. The technique removes individual speaker idiosyncrasies to produce more universally applicable and robust allophone models. In one embodiment, the eigenvoice technique is used to identify the centroid of each speaker, which may then be “subtracted out” of the recognition equation. In another embodiment, maximum likelihood estimation techniques are used to develop common decision tree frameworks that may be shared across all speakers when constructing the eigenvoice representation of speaker space.
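The centroid embodiment in the abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes each speaker is summarized by a fixed-length "supervector" of model parameters, uses plain PCA (via SVD) as the dimensionality reduction, and treats allophone parameters as vectors in the same space as the centroid. The function names `train_eigenvoices`, `speaker_centroid`, and `remove_speaker_offset` are hypothetical.

```python
import numpy as np

def train_eigenvoices(supervectors, n_components):
    """Assumed setup: rows are per-speaker supervectors; PCA stands in
    for the dimensionality reduction. Returns the mean supervector and
    the top principal directions ("eigenvoices"), one per row."""
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal principal directions of the training set.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def speaker_centroid(supervector, mean, eigenvoices):
    """Project a speaker's supervector into the eigenspace and map the
    result back to parameter space; this point serves as the centroid."""
    weights = eigenvoices @ (supervector - mean)
    return mean + eigenvoices.T @ weights

def remove_speaker_offset(allophone_vectors, centroid):
    """'Subtract out' the speaker centroid so that only the
    speaker-independent offsets of each allophone remain."""
    return allophone_vectors - centroid

# Synthetic illustration: 6 training speakers, 4-dimensional supervectors.
rng = np.random.default_rng(0)
training = rng.normal(size=(6, 4))
mean, eigenvoices = train_eigenvoices(training, n_components=2)

new_speaker = rng.normal(size=4)
centroid = speaker_centroid(new_speaker, mean, eigenvoices)

allophones = rng.normal(size=(3, 4))  # hypothetical allophone parameters
offsets = remove_speaker_offset(allophones, centroid)
```

The same projection is used for training speakers and for a new speaker at run time, which mirrors the abstract's statement that the technique applies in both phases. In a real recognizer the supervectors would be built from trained model parameters rather than random numbers.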
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.