Patent · US Expired

Method and apparatus for extracting speech related facial features for use in speech recognition systems

US5771306A · kind A · utility

52 Cited by

10 References

10 Claims

0 Family size

Assignee

RICOH COMPANY, LTD. · JP

Inventors

David G. Stork · Stanford, US
Gregory J. Wolff · Redwood City, US
Earl Levine · Palo Alto, US

Key dates

Filing date	Oct 22, 1993
Grant date	Jun 23, 1998
Priority date	—
Expiry date	Oct 22, 2013

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/18
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

The apparatus for the recognition of speech comprises an acoustic preprocessor, a visual preprocessor, and a speech classifier that operates on the acoustic and visual preprocessed data. The acoustic preprocessor comprises a log mel spectrum analyzer that produces an equal mel bandwidth log power spectrum. The visual preprocessor detects the motion of a set of fiducial markers on the speaker's face and extracts a set of normalized distance vectors describing lip and mouth movement. The speech classifier uses a multilevel time-delay neural network operating on the preprocessed acoustic and visual data to form an output probability distribution that indicates the probability of each candidate utterance having been spoken, based on the acoustic and visual data.
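
The two preprocessing steps and the final probability output described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything specific in it is an assumption made for illustration: the mel formula (a common approximation), the rectangular pooling of spectrum bins into equal-mel-width bands, the marker names and the choice of reference distance, and the use of softmax to turn classifier scores into probabilities. The time-delay neural network itself is not modeled.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale approximation (assumed; the abstract gives no formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def log_mel_band_powers(power_spectrum, sample_rate, n_bands):
    """Pool a one-sided power spectrum into bands of equal mel width, then take logs.

    power_spectrum needs at least two bins (0 Hz up to the Nyquist frequency).
    Rectangular pooling is an assumption; other filter shapes are possible.
    """
    n_bins = len(power_spectrum)
    nyquist = sample_rate / 2.0
    mel_max = hz_to_mel(nyquist)
    band_power = [0.0] * n_bands
    for k, p in enumerate(power_spectrum):
        f_hz = k * nyquist / (n_bins - 1)  # frequency of bin k
        band = min(int(hz_to_mel(f_hz) / mel_max * n_bands), n_bands - 1)
        band_power[band] += p
    eps = 1e-12  # keeps log() finite for empty or silent bands
    return [math.log(p + eps) for p in band_power]


def normalized_distances(markers, pairs, ref_pair):
    """Distances between marker pairs, divided by a reference distance.

    markers:  dict of hypothetical marker name -> (x, y) position
    pairs:    list of (name_a, name_b) whose separation is a feature
    ref_pair: (name_a, name_b) used as the scale, so features do not
              depend on how large the face appears in the image
    """
    def dist(a, b):
        return math.dist(markers[a], markers[b])

    scale = dist(*ref_pair)
    return [dist(a, b) / scale for a, b in pairs]


def softmax(scores):
    """Turn raw classifier scores into a probability distribution (assumed choice)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical marker layout for one video frame.
markers = {
    "brow": (0.0, 4.0), "nose": (0.0, 0.0),
    "lip_left": (-1.0, -2.0), "lip_right": (1.0, -2.0),
    "lip_top": (0.0, -1.5), "lip_bottom": (0.0, -2.5),
}
visual = normalized_distances(
    markers,
    [("lip_left", "lip_right"), ("lip_top", "lip_bottom")],
    ("brow", "nose"),
)
acoustic = log_mel_band_powers([1.0] * 9, sample_rate=8000, n_bands=4)
probabilities = softmax([2.0, 0.5, -1.0])  # stand-in scores, one per candidate utterance
```

Dividing by a reference distance between markers that do not move during speech makes the visual features insensitive to the speaker's distance from the camera, which is one plausible reading of "normalized" in the abstract.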

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.