Patent · US Active

Controllable, natural paralinguistics for text to speech synthesis

US12361925B2 · kind B2 · utility

Cited by: 0
References: 3
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Dec 29, 2020
Grant date: Jul 15, 2025
Priority date
Expiry date: Dec 29, 2040

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G10L15/26
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A speech recognition module receives training data of speech and creates a representation for individual words, non-words, phonemes, and any combination of these. A set of speech processing detectors analyzes the training data of speech from humans communicating and detects speech parameters that are indicative of paralinguistic effects on top of enunciated words, phonemes, and non-words in the audio stream. One or more machine learning models undergo supervised machine learning on their neural networks to learn how to associate one or more mark-up markers with a textual representation of each individual word, non-word, phoneme, or combination of these that was enunciated with a particular paralinguistic effect. Each mark-up marker can correspond to its own paralinguistic effect.
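
To make the idea of associating mark-up markers with a textual representation concrete, the following is a minimal illustrative sketch and not the patented method. It assumes the recognizer emits time-aligned tokens and the detectors emit time-stamped effect labels; the names Token, Detection, attach_effects, and to_markup, the tag syntax, and the overlap rule are all hypothetical choices made for this example. The output is the kind of marked-up text that could serve as a supervised training target.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """A recognized unit with its time span (hypothetical structure)."""
    text: str
    start: float
    end: float
    kind: str = "word"  # "word", "non-word", or "phoneme"
    effects: list = field(default_factory=list)


@dataclass
class Detection:
    """A paralinguistic effect reported by a detector (hypothetical structure)."""
    effect: str
    start: float
    end: float


def attach_effects(tokens, detections):
    """Label each token with every effect whose time span overlaps it."""
    for tok in tokens:
        for det in detections:
            overlaps = det.start < tok.end and tok.start < det.end
            if overlaps and det.effect not in tok.effects:
                tok.effects.append(det.effect)
    return tokens


def to_markup(tokens):
    """Render tokens as text, wrapping each one in a tag per attached effect."""
    parts = []
    for tok in tokens:
        text = tok.text
        # Wrap from the last effect inward so the first effect is outermost.
        for effect in reversed(tok.effects):
            text = f"<{effect}>{text}</{effect}>"
        parts.append(text)
    return " ".join(parts)


# Assumed example data: three aligned tokens and one detected effect.
tokens = [
    Token("well", 0.0, 0.3),
    Token("um", 0.3, 0.5, kind="non-word"),
    Token("sure", 0.5, 0.9),
]
detections = [Detection("laughing", 0.45, 1.0)]

marked = to_markup(attach_effects(tokens, detections))
print(marked)  # well <laughing>um</laughing> <laughing>sure</laughing>
```

In this sketch each marker maps to exactly one effect, mirroring the abstract's statement that each mark-up marker can correspond to its own paralinguistic effect. A trained model would replace the simple overlap rule used here.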

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.