Patent · US Active

Performing utterance detection using convolution

US11769491B1 · kind B1 · utility

Cited by	3
References	1
Claims	21
Family size	0

Assignee

AMAZON TECHNOLOGIES, INC. · US

Inventors

Abhishek Bafna · Seattle, US
Haithem Albadawi · Redmond, US

Key dates

Filing date	Sep 29, 2020
Grant date	Sep 26, 2023
Priority date	—
Expiry date	Feb 17, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/088
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A system configured to perform utterance detection using data processing techniques that are similar to those used for object detection is provided. For example, the system may treat utterances within audio data as analogous to an object represented within an image and employ techniques to separate and identify individual utterances. The system may include one or more trained models that are trained to perform utterance detection. For example, the system may include a first module to process input audio data and identify whether speech is represented in the input audio data, a second module to apply convolution filters, and a third module configured to determine a boundary identifying a beginning and ending of a portion of the input audio data along with an utterance score indicating how closely the portion of the input audio data represents an utterance.
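The three-stage flow described in the abstract (speech presence check, convolution filtering, boundary plus utterance score) can be illustrated with a minimal sketch. This is not the patented method: the patent describes trained models, whereas the code below substitutes simple hand-written stand-ins (an energy threshold, a fixed smoothing kernel, and a normalized mean as the score). All function names, parameters, and default values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def speech_present(energies: Sequence[float], threshold: float) -> bool:
    """Stage 1 stand-in (assumption): any frame above an energy threshold."""
    return any(e > threshold for e in energies)


def convolve_same(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Stage 2 stand-in (assumption): zero-padded 1-D convolution, same length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + half - k  # index of the input sample paired with kernel[k]
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def detect_utterances(
    samples: Sequence[float],
    frame_len: int = 4,
    threshold: float = 0.1,  # assumed non-negative
    kernel: Sequence[float] = (0.25, 0.5, 0.25),
) -> List[Tuple[int, int, float]]:
    """Stage 3 stand-in (assumption): return (start_frame, end_frame, score).

    end_frame is exclusive. The score is the mean filtered energy of the
    segment divided by the global peak, so it lies in (0, 1].
    """
    energies = frame_energies(samples, frame_len)
    if not speech_present(energies, threshold):
        return []
    smoothed = convolve_same(energies, kernel)
    peak = max(smoothed)
    results = []
    start = None
    for i, v in enumerate(smoothed + [0.0]):  # sentinel closes a trailing run
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            run = smoothed[start:i]
            results.append((start, i, sum(run) / len(run) / peak))
            start = None
    return results
```

For example, calling `detect_utterances([0.0] * 8 + [1.0] * 8 + [0.0] * 8)` yields a single segment whose boundary spans the loud region plus the frames the smoothing kernel spreads into. A real system along the lines of the abstract would replace each stand-in with a trained model.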

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.