Unsupervised cross-domain data augmentation for long-document based prediction and explanation
US12321841B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Filing date | Oct 24, 2022 |
| Grant date | Jun 3, 2025 |
| Priority date | — |
| Expiry date | Jun 17, 2043 |
Classification
- Technology area (CPC H): Electricity
- CPC primary: H03M7/3082
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Unsupervised cross-domain data augmentation techniques for long-text document based prediction and explanation are provided. In one aspect, a system for long-document based prediction includes: an encoder for creating embeddings of long-document texts with hierarchical sparse self-attention, and making predictions using the embeddings of the long-document texts; and a multi-source counterfactual augmentation module for generating perturbed long-document texts using unlabeled sentences from at least one external source to train the encoder. A method for long-document based prediction is also provided.
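As a rough illustration of the augmentation idea in the abstract, the sketch below perturbs a long document by swapping some of its sentences for unlabeled sentences drawn from an external pool. It is not the patented method: the abstract does not say how sentences are selected or inserted, so the function name `perturb_document`, the `replace_ratio` and `seed` parameters, and the uniform random replacement strategy are all assumptions made for illustration. The encoder with hierarchical sparse self-attention is not modeled here.

```python
import random
from typing import List, Sequence


def perturb_document(
    sentences: Sequence[str],
    external_pool: Sequence[str],
    replace_ratio: float = 0.2,
    seed: int = 0,
) -> List[str]:
    """Return a copy of `sentences` with some entries swapped for external ones.

    Hypothetical helper: uniform random replacement is an assumption,
    not the selection strategy described in the patent.
    """
    if not sentences or not external_pool:
        return list(sentences)
    rng = random.Random(seed)
    # Replace at least one sentence and at most all of them.
    n_replace = max(1, min(len(sentences), round(replace_ratio * len(sentences))))
    positions = rng.sample(range(len(sentences)), n_replace)
    perturbed = list(sentences)
    for pos in positions:
        # Draw an unlabeled sentence from the external source.
        perturbed[pos] = rng.choice(external_pool)
    return perturbed


# Toy long document, split into sentences (illustrative data only).
document = [
    "The patient was admitted with chest pain.",
    "An ECG was performed on arrival.",
    "Troponin levels were elevated.",
    "The patient was started on aspirin.",
    "Discharge was planned after three days.",
]

# Unlabeled sentences from a different domain (illustrative data only).
pool = [
    "The quarterly report showed rising revenue.",
    "The committee postponed the vote.",
    "Rainfall was above average this spring.",
]

augmented = perturb_document(document, pool, replace_ratio=0.4, seed=1)
changed = [i for i, (a, b) in enumerate(zip(document, augmented)) if a != b]
```

In a training setup, pairs of original and perturbed documents like `document` and `augmented` would be fed to the encoder; how the perturbed texts are used in the training objective is not stated in the abstract.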
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.