Patent · US Active

Systems and methods for human listening and live captioning

US11922963B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 22
Family size: 0

Assignee

Inventors

Key dates

Filing date: May 26, 2021
Grant date: Mar 5, 2024
Priority date
Expiry date: May 26, 2041

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L25/51
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Systems and methods are provided for generating and operating a speech enhancement model optimized for generating noise-suppressed speech outputs for improved human listening and live captioning. A computing system obtains a speech enhancement model trained on a first training dataset to generate noise-suppressed speech outputs and an automatic speech recognition model trained on a second training dataset to generate transcription labels for spoken language utterances. A third training dataset comprising a set of spoken language utterances is applied to the speech enhancement model to obtain a first noise-suppressed speech output which is applied to the automatic speech recognition model to generate a noise-suppressed transcription output for the set of spoken language utterances. Speech enhancement model parameters are updated to optimize the speech enhancement model to generate optimized noise-suppressed speech outputs based on a comparison of the noise-suppressed transcription output and ground truth transcription labels.
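
The training loop described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the "speech enhancement model" here is assumed to be one trainable gain per feature, the "automatic speech recognition model" is assumed to be a small frozen linear classifier, and the dataset, the weights, and the names `enhance`, `asr_probs`, `evaluate`, and `train` are all hypothetical. The sketch shows only the central idea: enhanced output is passed through the frozen recognizer, the result is compared with ground truth labels, and only the enhancement parameters are updated.

```python
import math

# Toy stand-ins (assumptions, not the patented models):
# - "speech enhancement model": one trainable gain per input feature.
# - "ASR model": a frozen linear classifier over two transcription labels.
ASR_WEIGHTS = [[1.0, 2.0], [-1.0, -2.0]]  # frozen; never updated below

# Hypothetical third training dataset: (noisy features, ground-truth label).
# Feature 0 carries the speech cue; feature 1 is noise that misleads the ASR model.
DATASET = [
    ([1.0, -0.8], 0),
    ([1.0, 0.6], 0),
    ([-1.0, 0.7], 1),
    ([-1.0, -0.5], 1),
]


def enhance(gains, noisy):
    """Apply the enhancement model: scale each feature by its gain."""
    return [g * x for g, x in zip(gains, noisy)]


def asr_probs(features):
    """Run the frozen ASR model and return softmax label probabilities."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in ASR_WEIGHTS]
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def evaluate(gains):
    """Return (mean cross-entropy, accuracy) of ASR output on enhanced input."""
    loss, correct = 0.0, 0
    for noisy, label in DATASET:
        probs = asr_probs(enhance(gains, noisy))
        loss -= math.log(probs[label])
        correct += probs.index(max(probs)) == label
    return loss / len(DATASET), correct / len(DATASET)


def train(gains, steps=200, lr=0.1):
    """Gradient descent on the ASR loss with respect to the gains only."""
    gains = list(gains)
    for _ in range(steps):
        grad = [0.0] * len(gains)
        for noisy, label in DATASET:
            probs = asr_probs(enhance(gains, noisy))
            for k, p in enumerate(probs):
                # d(cross-entropy)/d(logit_k) = p_k - 1[k == label]
                err = p - (1.0 if k == label else 0.0)
                for j, x in enumerate(noisy):
                    # d(logit_k)/d(gain_j) = ASR_WEIGHTS[k][j] * x_j
                    grad[j] += err * ASR_WEIGHTS[k][j] * x
        gains = [g - lr * d / len(DATASET) for g, d in zip(gains, grad)]
    return gains


initial_gains = [1.0, 1.0]  # enhancement model before ASR-guided tuning
trained_gains = train(initial_gains)
initial_loss, initial_accuracy = evaluate(initial_gains)
final_loss, final_accuracy = evaluate(trained_gains)
print("before:", initial_loss, initial_accuracy)
print("after: ", final_loss, final_accuracy, trained_gains)
```

In this toy setting the gradient flows through the frozen recognizer into the gains, so the gain on the noise feature shrinks while the gain on the speech feature grows. A real system would replace both stand-ins with neural networks and would likely combine this loss with a signal-quality objective, which the sketch does not attempt.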

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.