Cross-lingual zero-shot transfer via semantic and synthetic representation learning
US12050870B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Sep 1, 2021 |
| Grant date | Jul 30, 2024 |
| Priority date | — |
| Expiry date | Aug 25, 2042 |
Classification
- Technology area (CPC G)Physics
- CPC primaryG06N3/096
- WIPO fieldComputer technology
- WIPO sectorElectrical engineering
Abstract
A computer-implemented method is provided for cross-lingual transfer. The method includes randomly masking a source corpus and a target corpus to obtain a masked source corpus and a masked target corpus. The method further includes tokenizing, by pretrained Natural Language Processing (NLP) models, the masked source corpus and the masked target corpus to obtain source tokens and target tokens. The method also includes transforming the source tokens and the target tokens into a source dependency parsing tree and a target dependency parsing tree. The method additionally includes inputting the source dependency parsing tree and the target dependency parsing tree into a graph encoder pretrained on a translation language modeling task to extract common language information for transfer. The method further includes fine-tuning the graph encoder and a down-stream network for a specific NLP down-stream task.
Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.