Robust direct speech-to-speech translation
US11960852B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Dec 15, 2021 |
| Grant date | Apr 16, 2024 |
| Priority date | — |
| Expiry date | Feb 21, 2042 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L19/16
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
A direct speech-to-speech translation (S2ST) model includes an encoder configured to receive an input speech representation that corresponds to an utterance spoken by a source speaker in a first language and encode the input speech representation into a hidden feature representation. The S2ST model also includes an attention module configured to generate a context vector that attends to the hidden feature representation encoded by the encoder. The S2ST model also includes a decoder configured to receive the context vector generated by the attention module and predict a phoneme representation that corresponds to a translation of the utterance in a different second language. The S2ST model also includes a synthesizer configured to receive the context vector and the phoneme representation and generate a translated synthesized speech representation that corresponds to a translation of the utterance spoken in the different second language.
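The data flow described in the abstract (encoder, attention module, decoder, synthesizer) can be illustrated with a minimal toy sketch. This is not the patented implementation: the class name `ToyS2ST`, the layer sizes, the random weights, the dot-product attention, the greedy phoneme choice, and the use of the previous phoneme embedding as the attention query are all assumptions made purely for illustration. Only the wiring between the four components follows the abstract.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def make_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Random weights stand in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(scores: Vector) -> Vector:
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyS2ST:
    """Hypothetical toy model; only the component wiring follows the abstract."""

    def __init__(self, n_mels: int = 4, hidden: int = 6,
                 n_phonemes: int = 5, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.hidden = hidden
        self.w_enc = make_matrix(hidden, n_mels, rng)        # encoder projection
        self.w_dec = make_matrix(n_phonemes, hidden, rng)    # phoneme logits
        self.phoneme_emb = make_matrix(n_phonemes, hidden, rng)
        self.w_syn = make_matrix(n_mels, 2 * hidden, rng)    # synthesizer projection

    def encode(self, frames: Matrix) -> Matrix:
        """Encoder: input speech frames -> hidden feature representation."""
        return [[math.tanh(x) for x in matvec(self.w_enc, f)] for f in frames]

    def attend(self, query: Vector, hidden_states: Matrix) -> Tuple[Vector, Vector]:
        """Attention: context vector as a weighted sum of encoder states."""
        scores = [sum(q * h for q, h in zip(query, hs)) for hs in hidden_states]
        weights = softmax(scores)
        context = [sum(w * hs[i] for w, hs in zip(weights, hidden_states))
                   for i in range(self.hidden)]
        return context, weights

    def decode(self, context: Vector) -> int:
        """Decoder: context vector -> predicted target-language phoneme id."""
        logits = matvec(self.w_dec, context)
        return max(range(len(logits)), key=logits.__getitem__)

    def synthesize(self, context: Vector, phoneme: int) -> Vector:
        """Synthesizer: context vector + phoneme representation -> output frame."""
        return matvec(self.w_syn, context + self.phoneme_emb[phoneme])

    def translate(self, frames: Matrix, n_steps: int) -> Tuple[List[int], Matrix]:
        hidden_states = self.encode(frames)
        query = [0.0] * self.hidden  # assumed initial attention query
        phonemes: List[int] = []
        out_frames: Matrix = []
        for _ in range(n_steps):
            context, _ = self.attend(query, hidden_states)
            phoneme = self.decode(context)
            out_frames.append(self.synthesize(context, phoneme))
            phonemes.append(phoneme)
            query = self.phoneme_emb[phoneme]  # assumed feedback of last phoneme
        return phonemes, out_frames
```

Calling `ToyS2ST().translate(frames, n_steps=3)` on a short list of 4-dimensional frames returns three phoneme ids and three 4-dimensional output frames. The point the sketch makes is that the synthesizer consumes both the context vector and the phoneme representation, as the abstract states, rather than the phoneme prediction alone.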
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.