Vision-assisted speech processing
US11257493B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Jul 11, 2019 |
| Grant date | Feb 22, 2022 |
| Priority date | — |
| Expiry date | May 31, 2040 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2015/223
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Systems and methods for processing speech are described. In certain examples, image data is used to generate visual feature tensors and audio data is used to generate audio feature tensors. The visual feature tensors and the audio feature tensors are used by a linguistic model to determine linguistic features that are usable to parse an utterance of a user. The generation of the feature tensors may be jointly configured with the linguistic model. Systems may be provided in a client-server architecture.
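The data flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`visual_features`, `audio_features`, `linguistic_model`), the label set, the pooling and energy computations, and the fixed weights are all assumptions chosen only to show two feature extractors feeding one linguistic model.

```python
import math
from typing import List, Sequence

Vector = List[float]

# Hypothetical label set; the abstract does not name specific linguistic features.
LABELS = ["AH", "B", "K", "S"]


def visual_features(image: Sequence[Sequence[float]], size: int = 4) -> Vector:
    """Toy stand-in for a visual feature extractor: average-pool pixels into `size` values."""
    flat = [px for row in image for px in row]
    chunk = max(1, len(flat) // size)
    return [sum(flat[i * chunk:(i + 1) * chunk]) / chunk for i in range(size)]


def audio_features(samples: Sequence[float], frames: int = 4) -> Vector:
    """Toy stand-in for an audio feature extractor: mean energy per frame."""
    hop = max(1, len(samples) // frames)
    return [sum(s * s for s in samples[i * hop:(i + 1) * hop]) / hop for i in range(frames)]


def linguistic_model(visual: Vector, audio: Vector, weights: Sequence[Sequence[float]]) -> Vector:
    """Toy linguistic model: linear layer plus softmax over the concatenated features."""
    joint = list(visual) + list(audio)
    logits = [sum(w * x for w, x in zip(row, joint)) for row in weights]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up inputs: a 2x4 grayscale image and 8 audio samples.
image = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
samples = [0.0, 0.5, -0.5, 1.0, -1.0, 0.25, -0.25, 0.0]

# Arbitrary fixed weights: one row per label, one column per joint feature.
weights = [
    [0.1 * (r + 1) if c % len(LABELS) == r else 0.0 for c in range(8)]
    for r in range(len(LABELS))
]

probs = linguistic_model(visual_features(image), audio_features(samples), weights)
best = LABELS[probs.index(max(probs))]
```

In a jointly configured system as the abstract describes it, the extractors and the linguistic model would be configured together rather than hand-set as here; the fixed `weights` above exist only so the sketch runs end to end.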
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.