Patent · US Active

Controllable, natural paralinguistics for text to speech synthesis

US12361925B2 · kind B2 · utility

Cited by	0
References	3
Claims	20
Family size	0

Assignee

SRI International · US

Inventors

Harry Bratt · Mountain View, US
Colleen Richey · Foster City, US
Maneesh Yadav · Menlo Park, US

Key dates

Filing date	Dec 29, 2020
Grant date	Jul 15, 2025
Priority date	—
Expiry date	Dec 29, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/26
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A speech recognition module receives training data of speech and creates a representation for individual words, non-words, phonemes, and any combination of these. A set of speech processing detectors analyzes the training data of speech from humans communicating. The detectors detect speech parameters that are indicative of paralinguistic effects on top of enunciated words, phonemes, and non-words in the audio stream. One or more machine learning models undergo supervised machine learning on their neural networks to learn how to associate one or more mark-up markers with a textual representation of each individual word, non-word, phoneme, or any combination of these that was enunciated with a particular paralinguistic effect. Each mark-up marker can correspond to its own paralinguistic effect.
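
The association between mark-up markers and text described in the abstract can be illustrated with a minimal sketch. It assumes the detectors have already attached at most one paralinguistic label to each recognized token. The `Token` class, the `mark_up` function, the effect names ("laughter", "hesitation"), and the angle-bracket marker syntax are all hypothetical choices made for illustration; the abstract does not specify a marker format. The output string is the kind of marked-up text that could be paired with audio as a supervised training target.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """A recognized unit: a word, a non-word (e.g. "um"), or a phoneme."""
    text: str
    # Hypothetical label produced by a speech processing detector.
    effect: Optional[str] = None


def mark_up(tokens: List[Token]) -> str:
    """Wrap each run of consecutive tokens sharing an effect in one marker pair."""
    out: List[str] = []
    current: Optional[str] = None  # effect of the marker currently open, if any
    for tok in tokens:
        if tok.effect != current:
            if current is not None:
                out.append(f"</{current}>")  # close the previous marker
            if tok.effect is not None:
                out.append(f"<{tok.effect}>")  # open a marker for the new effect
            current = tok.effect
        out.append(tok.text)
    if current is not None:
        out.append(f"</{current}>")  # close a marker left open at the end
    return " ".join(out)


tokens = [
    Token("that"),
    Token("was"),
    Token("so", "laughter"),
    Token("funny", "laughter"),
    Token("um", "hesitation"),
]
print(mark_up(tokens))
# that was <laughter> so funny </laughter> <hesitation> um </hesitation>
```

Grouping consecutive tokens under one marker pair keeps the marked-up text compact. A per-token scheme would work equally well if each marker must bind to exactly one word, non-word, or phoneme.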

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.