Patent · US Active

Unsupervised cross-domain data augmentation for long-document based prediction and explanation

US12321841B2 · kind B2 · utility

Cited by: 0
References: 2
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Oct 24, 2022
Grant date: Jun 3, 2025
Priority date:
Expiry date: Jun 17, 2043

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H03M7/3082
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

Unsupervised cross-domain data augmentation techniques for long-text document based prediction and explanation are provided. In one aspect, a system for long-document based prediction includes: an encoder for creating embeddings of long-document texts with hierarchical sparse self-attention, and making predictions using the embeddings of the long-document texts; and a multi-source counterfactual augmentation module for generating perturbed long-document texts using unlabeled sentences from at least one external source to train the encoder. A method for long-document based prediction is also provided.
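
The abstract does not say how the perturbed long-document texts are built, so the following is only a minimal illustrative sketch of one possible reading: some sentences of a document are swapped for unlabeled sentences drawn from an external pool. The function name `perturb_document`, the `replace_fraction` parameter, and the uniform random sampling are all assumptions made for illustration and are not taken from the patent.

```python
import random
from typing import List, Optional, Sequence


def perturb_document(
    sentences: Sequence[str],
    external_pool: Sequence[str],
    replace_fraction: float = 0.2,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return a copy of `sentences` with a fraction of them replaced.

    Hypothetical helper: replacement sentences are sampled uniformly from
    `external_pool` (unlabeled text from an external source). The patent's
    actual selection strategy is not described in the abstract.
    """
    if not 0.0 <= replace_fraction <= 1.0:
        raise ValueError("replace_fraction must be between 0 and 1")
    if not external_pool:
        raise ValueError("external_pool must not be empty")

    rng = rng or random.Random()
    # Number of sentence positions to overwrite in this document.
    n_replace = round(len(sentences) * replace_fraction)
    # Pick distinct positions so exactly n_replace sentences are overwritten.
    positions = rng.sample(range(len(sentences)), n_replace)

    perturbed = list(sentences)  # copy; the input document is left untouched
    for pos in positions:
        perturbed[pos] = rng.choice(external_pool)
    return perturbed


if __name__ == "__main__":
    document = [
        "The company reported quarterly revenue.",
        "Operating costs rose during the period.",
        "Management expects stable demand.",
        "No material litigation is pending.",
    ]
    pool = [
        "Analysts discussed sector trends.",
        "A new product line was announced.",
    ]
    print(perturb_document(document, pool, replace_fraction=0.5,
                           rng=random.Random(0)))
```

In a training setup along the lines the abstract describes, the original and perturbed documents would both be fed to the encoder; how the two are paired and what loss is applied is not stated in the abstract and is left out of this sketch.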

Source: USPTO / EPO open patent data (bibliographic details and citation counts).