Patent · US Expired

Method and apparatus for extracting speech related facial features for use in speech recognition systems

US5771306A · kind A · utility

Cited by: 52
References: 10
Claims: 10
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 22, 1993
Grant date: Jun 23, 1998
Priority date
Expiry date: Oct 22, 2013

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L25/18
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

The apparatus for the recognition of speech comprises an acoustic preprocessor, a visual preprocessor, and a speech classifier that operates on the acoustically and visually preprocessed data. The acoustic preprocessor comprises a log mel spectrum analyzer that produces an equal mel bandwidth log power spectrum. The visual preprocessor detects the motion of a set of fiducial markers on the speaker's face and extracts a set of normalized distance vectors describing lip and mouth movement. The speech classifier uses a multilevel time-delay neural network operating on the preprocessed acoustic and visual data to form an output probability distribution that indicates the probability of each candidate utterance having been spoken, based on the acoustic and visual data.
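
The visual preprocessing and the classifier output described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not state how distances are normalized, so dividing by a reference distance between two markers is an assumption here, as are the marker names (`nose`, `chin`, `lip_left`, and so on) and the function names. The softmax step is a common way to turn network scores into a probability distribution and is likewise an assumption, not something the abstract specifies.

```python
import math


def normalized_distances(markers, pairs, ref_pair):
    """Return distances between marker pairs, divided by a reference distance.

    markers:  dict mapping a (hypothetical) marker name to an (x, y) position
    pairs:    list of (name_a, name_b) tuples whose distances become features
    ref_pair: (name_a, name_b) whose distance is the normalizer (assumed scheme)
    """
    def dist(a, b):
        (xa, ya), (xb, yb) = markers[a], markers[b]
        return math.hypot(xa - xb, ya - yb)

    ref = dist(*ref_pair)
    if ref == 0:
        raise ValueError("reference markers coincide; cannot normalize")
    return [dist(a, b) / ref for a, b in pairs]


def softmax(scores):
    """Map raw classifier scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical marker positions for a single video frame (arbitrary units).
frame = {
    "nose": (0.0, 0.0),
    "chin": (0.0, 4.0),
    "lip_left": (-1.0, 2.0),
    "lip_right": (1.0, 2.0),
    "lip_top": (0.0, 1.5),
    "lip_bottom": (0.0, 2.5),
}
lip_pairs = [("lip_left", "lip_right"), ("lip_top", "lip_bottom")]
features = normalized_distances(frame, lip_pairs, ("nose", "chin"))
```

Dividing by a reference distance makes the features independent of how large the face appears in the image, so the same mouth shape yields the same feature values when the speaker moves closer to or farther from the camera.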

Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.