Patent · US Active

Systems and methods for unsupervised autoregressive text compression

US11487939B2 · kind B2 · utility

Cited by: 0
References: 6
Claims: 18
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 23, 2019
Grant date: Nov 1, 2022
Priority date
Expiry date: Feb 10, 2040

Classification

  • Technology area (CPC H): Electricity
  • CPC primary: H03M7/3059
  • WIPO field: Basic communication processes
  • WIPO sector: Electrical engineering

Abstract

Embodiments described herein provide a fully unsupervised model for text compression. Specifically, the unsupervised model is configured to identify an optimal deletion path for each input text sequence (e.g., a sentence), and words from the input sequence are gradually deleted along the deletion path. To identify the optimal deletion path, the unsupervised model may adopt a pretrained bidirectional language model (BERT) to score each candidate deletion based on the average perplexity of the resulting sentence, and may perform a simple greedy look-ahead tree search to select the best deletion at each step.
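
The greedy look-ahead search over deletions described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names (`delete_at`, `lookahead_score`, `greedy_deletion_path`), the single-word deletion candidates, the `depth` and `min_len` parameters, and the toy scoring table are all assumptions made for illustration. In particular, `toy_average_surprisal` is a hypothetical stand-in for the BERT-based average perplexity score, which is not reproduced here.

```python
from typing import Callable, Dict, List, Sequence

# A scorer maps a token sequence to a number; lower means "more fluent".
Scorer = Callable[[Sequence[str]], float]


def delete_at(tokens: Sequence[str], i: int) -> List[str]:
    """Return a copy of tokens with the word at index i removed."""
    return list(tokens[:i]) + list(tokens[i + 1:])


def lookahead_score(tokens: Sequence[str], score: Scorer, depth: int) -> float:
    """Lowest score reachable from tokens using at most `depth` further deletions."""
    best = score(tokens)
    if depth == 0 or len(tokens) <= 1:
        return best
    for i in range(len(tokens)):
        best = min(best, lookahead_score(delete_at(tokens, i), score, depth - 1))
    return best


def greedy_deletion_path(
    tokens: Sequence[str], score: Scorer, min_len: int = 1, depth: int = 1
) -> List[List[str]]:
    """Delete one word per step, greedily, until min_len words remain.

    Each candidate is ranked by its look-ahead score, with its immediate
    score as a tie-breaker. Returns every intermediate sequence.
    """
    current = list(tokens)
    path = [current]
    while len(current) > min_len:
        candidates = [delete_at(current, i) for i in range(len(current))]
        current = min(
            candidates,
            key=lambda c: (lookahead_score(c, score, depth), score(c)),
        )
        path.append(current)
    return path


# Hypothetical per-word costs standing in for a language model (assumption).
TOY_SURPRISAL: Dict[str, float] = {
    "the": 1.0, "very": 6.0, "fluffy": 5.0, "cat": 2.0, "sat": 2.0,
}


def toy_average_surprisal(tokens: Sequence[str]) -> float:
    """Toy scorer: mean per-word cost; unknown words get a default cost."""
    return sum(TOY_SURPRISAL.get(t, 4.0) for t in tokens) / len(tokens)


if __name__ == "__main__":
    sentence = ["the", "very", "fluffy", "cat", "sat"]
    for step in greedy_deletion_path(sentence, toy_average_surprisal, min_len=3):
        print(" ".join(step))
```

Under the toy scorer, each step removes the word whose deletion leads to the lowest average cost one further deletion ahead. Replacing `toy_average_surprisal` with a scorer backed by a pretrained language model would bring the sketch closer to the scoring the abstract describes, at the price of one model evaluation per candidate and look-ahead branch.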

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.