Customizing text-to-speech language models using adapters for conversational AI systems and applications
US12406653B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 13, 2022 |
| Grant date | Sep 2, 2025 |
| Priority date | — |
| Expiry date | Jun 12, 2043 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L13/0335
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
In various examples, one or more text-to-speech machine learning models may be customized or adapted to accommodate new or additional speakers or speaker voices without requiring a full re-training of the models. For example, a base model may be trained on a set of one or more speakers and, after training or deployment, the model may be adapted to support one or more other speakers. To do this, one or more additional layers (e.g., adapter layers) may be added to the model, and the model may be re-trained or updated—e.g., by freezing parameters of the base model while updating parameters of the adapter layers—to generate an adapted model that can support the one or more original speakers of the base model in addition to the one or more additional speakers corresponding to the adapter layers.
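The adapter idea in the abstract can be illustrated with a toy sketch. This is not the patented architecture or a real text-to-speech model. It assumes a single linear "base" layer standing in for the base model and one residual adapter matrix per added speaker. The class `AdaptedModel`, the speaker ID `"new_speaker"`, and all numeric values are hypothetical. The sketch only shows the mechanism the abstract names: base parameters stay frozen and only the adapter parameters are updated.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class AdaptedModel:
    """Toy stand-in: frozen linear base layer plus one residual adapter per added speaker."""

    def __init__(self, base_weights):
        # Base parameters are copied once and never updated afterwards (frozen).
        self.base = [row[:] for row in base_weights]
        # Hypothetical layout: speaker id -> trainable adapter matrix.
        self.adapters = {}

    def add_speaker(self, speaker_id):
        n = len(self.base)
        # Zero initialization makes a new adapter a no-op on the base output.
        self.adapters[speaker_id] = [[0.0] * n for _ in range(n)]

    def forward(self, x, speaker_id=None):
        h = matvec(self.base, x)
        if speaker_id not in self.adapters:
            return h  # original speakers use the unchanged base path
        delta = matvec(self.adapters[speaker_id], h)
        return [hi + di for hi, di in zip(h, delta)]

    def train_step(self, x, target, speaker_id, lr=0.05):
        """One gradient step on 0.5 * ||y - target||^2, touching only the adapter."""
        h = matvec(self.base, x)
        y = self.forward(x, speaker_id)
        err = [yi - ti for yi, ti in zip(y, target)]
        adapter = self.adapters[speaker_id]
        # d(loss)/d(adapter[i][j]) = err[i] * h[j]; self.base is never modified.
        for i in range(len(adapter)):
            for j in range(len(h)):
                adapter[i][j] -= lr * err[i] * h[j]
        return 0.5 * sum(e * e for e in err)


# Illustrative values only (assumptions, not data from the patent).
base = [[1.0, 0.5], [-0.5, 1.0]]
model = AdaptedModel(base)
model.add_speaker("new_speaker")

x, target = [1.0, 2.0], [3.0, 1.0]
losses = [model.train_step(x, target, "new_speaker") for _ in range(200)]
```

After training, `model.forward(x)` still returns the base output because the base weights were never touched, while `model.forward(x, "new_speaker")` approaches `target`. A real system would apply the same freeze-and-update pattern to adapter layers inserted into a neural text-to-speech model, typically through a deep learning framework rather than hand-written gradients.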
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.