Patent · US Active

Synthesizing speech recognition training data

US11308938B2 · kind B2 · utility

Cited by	3

References	0

Claims	7

Family size	0

Assignee

SoundHound, Inc. · US

Inventors

Maisy Wieman · Boulder, US
Jonah Probell · Milpitas, US
Sudharsan Krishnaswamy · San Jose, US

Key dates

Filing date	Dec 5, 2019
Grant date	Apr 19, 2022
Priority date	—
Expiry date	May 7, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G10L13/00
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

To train a speech recognizer, such as for recognizing variables in a neural speech-to-meaning system, compute, within an embedding space, a range of vectors of features of natural speech. Generate parameter sets for speech synthesis and synthesize speech according to the parameters. Analyze the synthesized speech to compute vectors in the embedding space. Using a cost function that favors an even spread (minimal clustering), generate a multiplicity of speech synthesis parameter sets. Using the multiplicity of parameter sets, generate a multiplicity of speech samples of known words that can be used as training data for speech recognition.
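
The selection step in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: every name here (`embed`, `in_range`, `clustering_cost`, `select_spread`) is hypothetical, the `embed` function is a toy stand-in for "synthesize speech, then analyze it into an embedding vector", the inverse-squared-distance cost is just one possible cost that penalizes clustering, and the greedy search is an assumption chosen for brevity.

```python
import itertools

def embed(params):
    """Toy stand-in (assumption) for synthesizing speech from a parameter
    set and analyzing it into an embedding vector."""
    pitch_hz, rate = params
    return (pitch_hz / 100.0, rate)

def in_range(vec, lo, hi):
    """True if the vector lies inside the range observed for natural speech."""
    return all(l <= v <= h for v, l, h in zip(vec, lo, hi))

def clustering_cost(vectors, eps=1e-9):
    """Sum of inverse squared pairwise distances: grows when points cluster,
    so minimizing it favors an even spread."""
    total = 0.0
    for a, b in itertools.combinations(vectors, 2):
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        total += 1.0 / (d2 + eps)
    return total

def select_spread(candidates, k, lo, hi):
    """Greedily choose up to k parameter sets whose embeddings stay in the
    natural range and add the least clustering cost at each step."""
    pool = [(p, embed(p)) for p in candidates]
    pool = [(p, v) for p, v in pool if in_range(v, lo, hi)]
    if not pool:
        return []
    chosen = [pool.pop(0)]
    while pool and len(chosen) < k:
        chosen_vecs = [v for _, v in chosen]
        best = min(
            range(len(pool)),
            key=lambda i: clustering_cost(chosen_vecs + [pool[i][1]]),
        )
        chosen.append(pool.pop(best))
    return [p for p, _ in chosen]

# Hypothetical candidate grid of (pitch in Hz, speaking-rate multiplier).
candidates = [
    (pitch, r / 10)
    for pitch, r in itertools.product(range(60, 400, 20), range(4, 26, 2))
]

# Hypothetical embedding-space range measured from natural speech.
NATURAL_LO, NATURAL_HI = (0.8, 0.5), (3.0, 2.0)

selected = select_spread(candidates, 8, NATURAL_LO, NATURAL_HI)

# Baseline for comparison: the first 8 in-range candidates, which sit close together.
baseline = [
    p for p in candidates if in_range(embed(p), NATURAL_LO, NATURAL_HI)
][:8]

print("spread cost:  ", clustering_cost([embed(p) for p in selected]))
print("baseline cost:", clustering_cost([embed(p) for p in baseline]))
```

In a real pipeline, each selected parameter set would drive a speech synthesizer to produce audio of known words, and the resulting labeled audio would serve as recognizer training data.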

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.