Patent · US Active

Pose-invariant visual speech recognition using a single view input

US10937428B2 · kind B2 · utility

Cited by	0
References	2
Claims	20
Family size	0

Assignee

Adobe Inc. · US

Inventor

Yaman Kumar · New Delhi, IN

Key dates

Filing date	Mar 11, 2019
Grant date	Mar 2, 2021
Priority date	—
Expiry date	Nov 15, 2039

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/221
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A pose-invariant visual speech recognition system obtains a single view input of a speaker, such as a single video stream captured by a single camera. The single view input provides a particular pose of the speaker, which refers to a view angle, relative to the lens or image capture component of the camera that captured the video of the speaker, at which the speaker's face is captured. The pose of the speaker is used to select a visual speech recognition model to use to generate a text label that is the words spoken by the speaker. One or more additional view angles of the speaker are also generated from the single view input of the speaker. These one or more additional view angles, along with the single view input of the speaker, are used by the selected visual speech recognition model to generate the text label for the speaker.
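The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how the pose is estimated, how additional view angles are generated, or what form the recognition models take. Every name here (View, nearest_pose, synthesize_views, recognize, make_stub_model) is hypothetical, the models are stubs that only report which views they received, and the view synthesis is a placeholder that copies the input frames.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class View:
    angle: int          # view angle in degrees relative to the camera (assumed unit)
    frames: List[str]   # stand-in for the video frames of the speaker's face


# Hypothetical model type: takes all available views, returns a text label.
Model = Callable[[List[View]], str]


def nearest_pose(angle: int, supported: Iterable[int]) -> int:
    """Pick the supported pose closest to the observed view angle."""
    return min(supported, key=lambda a: abs(a - angle))


def synthesize_views(view: View, target_angles: Iterable[int]) -> List[View]:
    """Placeholder for generating additional view angles from the single input.

    A real system would produce new imagery per angle; this stub copies frames.
    """
    return [View(a, list(view.frames)) for a in target_angles if a != view.angle]


def recognize(view: View, models: Dict[int, Model],
              extra_angles: Iterable[int]) -> str:
    """Select a model by pose, then feed it the input view plus generated views."""
    model = models[nearest_pose(view.angle, models)]
    views = [view] + synthesize_views(view, extra_angles)
    return model(views)


def make_stub_model(pose: int) -> Model:
    """Build a stand-in model that reports the view angles it was given."""
    def model(views: List[View]) -> str:
        angles = sorted(v.angle for v in views)
        return f"pose-{pose} model saw views at {angles}"
    return model


# One stub model per supported pose (the angles are illustrative assumptions).
models = {pose: make_stub_model(pose) for pose in (0, 45, 90)}
result = recognize(View(40, ["f0", "f1"]), models, extra_angles=[0, 90])
print(result)
```

In this sketch the input captured at 40 degrees is routed to the model for the nearest supported pose, and that model receives the original view together with the two generated ones. Keying the models by pose mirrors the abstract's statement that the pose selects the model; nearest-angle matching is just one simple way to make that selection.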

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.