Patent · US Active

Dual use of audio noise level in speech-to-text framework

US11335350B2 · kind B2 · utility

4 Cited by

0 References

30 Claims

0 Family size

Assignee

SAS INSTITUTE, INC. · US

Inventors

Xiaolong Li · Beijing, CN
Xiaozhuo Cheng · Beijing, CN
Xu Yang · Cary, US

Key dates

Filing date	Oct 12, 2021
Grant date	May 17, 2022
Priority date	—
Expiry date	Oct 12, 2041

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2025/783
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

An apparatus includes processor(s) to: perform pre-processing operations including derive an audio noise level of speech audio of a speech data set, derive a first relative weighting for first and second segmentation techniques for identifying likely sentence pauses in the speech audio based on the audio noise level, and select likely sentence pauses for a converged set of likely sentence pauses from likely sentence pauses identified by the first and/or second segmentation techniques based on the first relative weighting; and perform speech-to-text processing operations including divide the speech data set into data segments representing speech segments of the speech audio based on the converged set of likely sentence pauses, and derive a second relative weighting based on the audio noise level for selecting words indicated by an acoustic model or by a language model as being most likely spoken in the speech audio for inclusion in a transcript.
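
The dual use of the noise level described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract gives no formulas, so everything here is an assumption made for illustration. That includes the linear mapping from noise level to weight, the direction of that mapping (less weight on the first segmentation technique and on the acoustic model as noise rises), the merge tolerance, the score threshold, and the hypothetical names clean_weight, converge_pauses, and pick_word.

```python
def clean_weight(noise_level: float) -> float:
    """Map a noise level in [0, 1] to a relative weighting in [0, 1].

    Assumption: a linear ramp that is 1.0 for clean audio and 0.0 for very
    noisy audio. The patent abstract does not specify this mapping.
    """
    return 1.0 - min(1.0, max(0.0, noise_level))


def converge_pauses(first, second, w_first, tolerance=0.25, threshold=0.5):
    """Merge likely sentence pauses from two segmentation techniques.

    first, second: lists of (time_in_seconds, confidence) candidates.
    w_first: relative weighting of the first technique; the second
    technique receives 1 - w_first.
    Candidates closer together than `tolerance` seconds are treated as the
    same pause (hypothetical rule), and a pause is kept when its summed
    weighted confidence reaches `threshold`.
    """
    weighted = [(t, w_first * c) for t, c in first]
    weighted += [(t, (1.0 - w_first) * c) for t, c in second]
    weighted.sort()

    # Group time-adjacent candidates into clusters.
    clusters = []
    for t, score in weighted:
        if clusters and t - clusters[-1][-1][0] <= tolerance:
            clusters[-1].append((t, score))
        else:
            clusters.append([(t, score)])

    converged = []
    for cluster in clusters:
        total = sum(score for _, score in cluster)
        if total > 0.0 and total >= threshold:
            # Place the pause at the score-weighted mean time of the cluster.
            converged.append(sum(t * score for t, score in cluster) / total)
    return converged


def pick_word(acoustic, language, w_acoustic):
    """Choose the word with the best blended score.

    acoustic, language: dicts mapping candidate words to probabilities from
    an acoustic model and a language model respectively.
    w_acoustic: relative weighting of the acoustic model; the language
    model receives 1 - w_acoustic.
    """
    candidates = sorted(set(acoustic) | set(language))
    return max(
        candidates,
        key=lambda w: w_acoustic * acoustic.get(w, 0.0)
        + (1.0 - w_acoustic) * language.get(w, 0.0),
    )


if __name__ == "__main__":
    noise = 0.25  # made-up noise level for illustration
    w = clean_weight(noise)  # the same noise level drives both weightings
    pauses = converge_pauses(
        first=[(1.0, 0.9), (3.0, 0.4)],
        second=[(1.1, 0.8), (5.0, 0.9)],
        w_first=w,
    )
    word = pick_word(
        acoustic={"wreck": 0.6, "recognize": 0.4},
        language={"wreck": 0.1, "recognize": 0.9},
        w_acoustic=w,
    )
    print(pauses, word)
```

In this sketch a single noise measurement feeds two separate decisions: which pause candidates survive into the converged set used to split the audio, and how far word selection leans on the acoustic model versus the language model. A real system would likely derive the two weightings with different mappings. The sketch reuses one function only for brevity.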

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.