Performing utterance detection using convolution
US11769491B1 · kind B1 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 29, 2020 |
| Grant date | Sep 26, 2023 |
| Priority date | — |
| Expiry date | Feb 17, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2015/088
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A system configured to perform utterance detection using data processing techniques that are similar to those used for object detection is provided. For example, the system may treat utterances within audio data as analogous to an object represented within an image and employ techniques to separate and identify individual utterances. The system may include one or more trained models that are trained to perform utterance detection. For example, the system may include a first module to process input audio data and identify whether speech is represented in the input audio data, a second module to apply convolution filters, and a third module configured to determine a boundary identifying a beginning and ending of a portion of the input audio data along with an utterance score indicating how closely the portion of the input audio data represents an utterance.
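The three-stage flow named in the abstract (speech presence, convolution filters, boundary plus utterance score) can be illustrated with a toy sketch. This is not the patented method: the abstract describes trained models, whereas the sketch below assumes a simple energy threshold for the first stage, a fixed averaging kernel for the second, and a run-length boundary finder for the third. All function names, thresholds, and the kernel are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean-square energy of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def speech_mask(energies: List[float], threshold: float) -> List[float]:
    """Stage 1 stand-in (assumption): 1.0 where a frame looks like speech."""
    return [1.0 if e > threshold else 0.0 for e in energies]


def convolve_same(signal: List[float], kernel: List[float]) -> List[float]:
    """Stage 2 stand-in (assumption): 1-D filter, zero padded, same length.

    Written in correlation form, as in most deep-learning libraries; for a
    symmetric kernel this equals true convolution.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def detect_utterances(
    activation: List[float], threshold: float = 0.5
) -> List[Tuple[int, int, float]]:
    """Stage 3 stand-in (assumption): runs above threshold become regions.

    Returns (start_frame, end_frame_exclusive, score), where the score is
    the mean activation inside the region.
    """
    regions = []
    start = None
    # A trailing 0.0 sentinel closes a run that reaches the last frame.
    for i, a in enumerate(activation + [0.0]):
        if a > threshold and start is None:
            start = i
        elif a <= threshold and start is not None:
            run = activation[start:i]
            regions.append((start, i, sum(run) / len(run)))
            start = None
    return regions


if __name__ == "__main__":
    # Synthetic audio: 3 silent frames, 5 loud frames, 3 silent frames.
    samples = [0.0] * 12 + [1.0] * 20 + [0.0] * 12
    mask = speech_mask(frame_energies(samples, frame_len=4), threshold=0.1)
    smoothed = convolve_same(mask, [1 / 3, 1 / 3, 1 / 3])
    print(detect_utterances(smoothed, threshold=0.5))
```

On the synthetic input, the sketch reports a single region spanning frames 3 to 8 (end exclusive) with a score of roughly 0.87. The score is lower than 1.0 because the averaging kernel softens the activation at both edges of the region, which loosely mirrors the idea of a score that reflects how closely a bounded portion represents an utterance.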
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.