Patent · US Active

Multilingual speech translation with adaptive speech synthesis and adaptive physiognomy

US11545134B1 · kind B1 · utility

Cited by	12

References	0

Claims	20

Family size	0

Assignee

AMAZON TECHNOLOGIES, INC. · US

Inventors

Marcello Federico · Georgetown, US
Robert Enyedi · Santa Clara, US
Yaser Al-Onaizan · Whitehall Corners, US
Roberto Barra-Chicote · Cambridge, GB
Andrew Paul Breen · Norwich, GB
Ritwik Giri · Sunnyvale, US
Mehmet Umut Isik · Menlo Park, US
Arvindh Krishnaswamy · San Jose, US
Hassan Sawaf · San Jose, US

Key dates

Filing date	Dec 10, 2019
Grant date	Jan 3, 2023
Priority date	—
Expiry date	Mar 3, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G10L25/90
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Techniques for generating dubbed audio for an audio/visual file are described. An exemplary approach is to receive a request to generate dubbed speech for an audio/visual file and, in response to the request: extract speech segments associated with identified speakers from an audio track of the audio/visual file; translate the extracted speech segments into a target language; determine a trained machine learning model per identified speaker, the trained machine learning models to be used to generate a spoken version of the translated, extracted speech segments based on the identified speaker; generate, per translated, extracted speech segment, a spoken version using the trained machine learning model that corresponds to the identified speaker of that segment and prosody information for the extracted speech segments; and replace the extracted speech segments in the audio track of the audio/visual file with the spoken versions of the translated, extracted speech segments to generate a modified audio track.
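
The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `SpeechSegment`, `dub_track`, `make_voice`, and `lexicon` are hypothetical, the audio track is modelled as a list of strings, translation is a toy dictionary lookup, and each per-speaker "model" is a plain function standing in for a trained speech synthesizer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SpeechSegment:
    slot: int       # position of the segment in the audio track (assumed layout)
    speaker: str    # label of the identified speaker
    text: str       # source-language content of the extracted segment
    prosody: dict   # prosody information carried over from the original speech


# Stand-in for a trained per-speaker model: (translated text, prosody) -> audio.
VoiceModel = Callable[[str, dict], str]


def dub_track(
    track: List[str],
    segments: List[SpeechSegment],
    translate: Callable[[str], str],
    voices: Dict[str, VoiceModel],
) -> List[str]:
    """Return a modified copy of `track` with each speech segment replaced."""
    dubbed = list(track)                      # leave the original track untouched
    for seg in segments:
        translated = translate(seg.text)      # translate into the target language
        voice = voices[seg.speaker]           # pick the model for this speaker
        dubbed[seg.slot] = voice(translated, seg.prosody)  # synthesize and replace
    return dubbed


# Toy stand-ins so the sketch runs end to end (all assumptions).
lexicon = {"hello": "hola", "goodbye": "adios"}


def make_voice(name: str) -> VoiceModel:
    def voice(text: str, prosody: dict) -> str:
        return f"<{name}:{text}@{prosody['rate']}>"
    return voice


track = ["[music]", "hello", "[door slams]", "goodbye"]
segments = [
    SpeechSegment(slot=1, speaker="alice", text="hello", prosody={"rate": 1.0}),
    SpeechSegment(slot=3, speaker="bob", text="goodbye", prosody={"rate": 0.9}),
]
voices = {"alice": make_voice("alice"), "bob": make_voice("bob")}

dubbed = dub_track(track, segments, lexicon.__getitem__, voices)
print(dubbed)
```

In this sketch the non-speech entries of the track pass through unchanged, and only the slots tied to an identified speaker are overwritten, mirroring the "replace ... to generate a modified audio track" step.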

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.