Patent · US Active

Dual use of audio noise level in speech-to-text framework

US11335350B2 · kind B2 · utility

Cited by: 4
References: 0
Claims: 30
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 12, 2021
Grant date: May 17, 2022
Priority date
Expiry date: Oct 12, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L2025/783
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

An apparatus includes processor(s) to: perform pre-processing operations including derive an audio noise level of speech audio of a speech data set, derive a first relative weighting for first and second segmentation techniques for identifying likely sentence pauses in the speech audio based on the audio noise level, and select likely sentence pauses for a converged set of likely sentence pauses from likely sentence pauses identified by the first and/or second segmentation techniques based on the first relative weighting; and perform speech-to-text processing operations including divide the speech data set into data segments representing speech segments of the speech audio based on the converged set of likely sentence pauses, and derive a second relative weighting based on the audio noise level for selecting words indicated by an acoustic model or by a language model as being most likely spoken in the speech audio for inclusion in a transcript.
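
The "dual use" described in the abstract can be illustrated with a minimal sketch in which one noise level drives both weightings. Everything below is an assumption made for illustration and is not taken from the patent: the function names (noise_to_weight, converge_pauses, pick_word), the clamped linear mapping, the 0-to-1 noise scale, the tolerance and threshold values, and the direction of each weighting (here the first segmentation technique and the acoustic model are assumed to be trusted less as noise rises). The abstract states only that both weightings are based on the audio noise level.

```python
def noise_to_weight(noise_level, low=0.0, high=1.0):
    """Map a noise level to a weight in [0, 1] by clamped linear
    interpolation. The linear form and the bounds are assumptions."""
    if high <= low:
        raise ValueError("high must be greater than low")
    t = (noise_level - low) / (high - low)
    return min(1.0, max(0.0, t))


def converge_pauses(first, second, w_first, tolerance=0.25, threshold=0.5):
    """Merge pause times (in seconds) proposed by two hypothetical
    segmentation techniques into one converged list.

    Candidates within `tolerance` seconds of the previous candidate are
    grouped. A group is kept when the summed weight of the distinct
    techniques that proposed it reaches `threshold`.
    """
    weights = {"first": w_first, "second": 1.0 - w_first}
    tagged = sorted([(t, "first") for t in first] +
                    [(t, "second") for t in second])
    clusters = []
    for t, name in tagged:
        if clusters and t - clusters[-1]["times"][-1] <= tolerance:
            clusters[-1]["times"].append(t)
            clusters[-1]["names"].add(name)
        else:
            clusters.append({"times": [t], "names": {name}})
    converged = []
    for c in clusters:
        score = sum(weights[n] for n in c["names"])
        if score >= threshold:
            # Report the mean time of the grouped candidates.
            converged.append(sum(c["times"]) / len(c["times"]))
    return converged


def pick_word(acoustic_probs, language_probs, w_acoustic):
    """Choose the word with the highest weighted sum of the two model
    probabilities. A word missing from one model scores 0 there."""
    words = sorted(set(acoustic_probs) | set(language_probs))

    def score(word):
        return (w_acoustic * acoustic_probs.get(word, 0.0)
                + (1.0 - w_acoustic) * language_probs.get(word, 0.0))

    return max(words, key=score)


def weights_from_noise(noise_level):
    """Derive both relative weightings from the same noise level.
    Assumption: more noise means less trust in the first segmentation
    technique and in the acoustic model."""
    w = 1.0 - noise_to_weight(noise_level)
    return w, w  # (w_first, w_acoustic)


if __name__ == "__main__":
    w_first, w_acoustic = weights_from_noise(0.3)
    print(converge_pauses([1.0, 3.0], [1.1, 5.0], w_first))
    print(pick_word({"beach": 0.7, "speech": 0.3},
                    {"beach": 0.1, "speech": 0.9}, w_acoustic))
```

In this sketch a pause proposed by both techniques is always kept, while a pause proposed by only one technique is kept only when that technique currently carries at least the threshold weight. The converged pause times would then serve as the cut points for dividing the speech data set into segments.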

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.