Patent · US Active

Transformer-based neural network including a mask attention network

US12260338B2 · kind B2 · utility

Cited by: 0
References: 0
Claims: 20
Family size: 0

Assignee

Inventors

Key dates

Filing date: Aug 27, 2020
Grant date: Mar 25, 2025
Priority date:
Expiry date: Jun 9, 2043

Classification

  • Technology area (CPC G): Physics
  • CPC primary: G06N3/09
  • WIPO field: Computer technology
  • WIPO sector: Electrical engineering

Abstract

A transformer-based neural network includes at least one mask attention network (MAN). The MAN computes an original attention data structure that expresses influence between pairs of data items in a sequence of data items. The MAN then modifies the original data structure by mask values in a mask data structure, to produce a modified attention data structure. Compared to the original attention data structure, the modified attention data structure better accounts for the influence of neighboring data items in the sequence of data items, given a particular data item under consideration. The mask data structure used by the MAN can have static and/or machine-trained mask values. In one implementation, the transformer-based neural network includes at least one MAN in combination with at least one other attention network that does not use a mask data structure, and at least one feed-forward neural network.
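
The abstract does not say how the mask values are combined with the original attention data structure. As an illustrative sketch only, and not the claimed method, the Python below assumes the combination is an element-wise product followed by row renormalization, and that a static mask is a band that keeps only nearby positions. The names `band_mask`, `attention_weights`, `apply_mask`, and `mask_attention`, and the `width` parameter, are hypothetical and do not come from the patent.

```python
import math


def band_mask(n, width):
    """Static mask (assumed form): 1.0 where |i - j| <= width, else 0.0."""
    return [[1.0 if abs(i - j) <= width else 0.0 for j in range(n)]
            for i in range(n)]


def attention_weights(queries, keys):
    """Original attention: row-wise softmax of scaled dot products."""
    d = len(queries[0])
    weights = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        top = max(scores)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def apply_mask(weights, mask):
    """Modified attention (assumed rule): multiply by the mask, renormalize rows."""
    modified = []
    for w_row, m_row in zip(weights, mask):
        scaled = [w * m for w, m in zip(w_row, m_row)]
        total = sum(scaled)
        if total == 0.0:
            raise ValueError("mask removes every position in a row")
        modified.append([s / total for s in scaled])
    return modified


def mask_attention(queries, keys, values, mask):
    """Weighted sum of value vectors under the masked attention weights."""
    weights = apply_mask(attention_weights(queries, keys), mask)
    width = len(values[0])
    return [[sum(w * v[c] for w, v in zip(row, values)) for c in range(width)]
            for row in weights]


# Toy sequence of four 2-dimensional data items (made-up values).
items = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
original = attention_weights(items, items)
modified = apply_mask(original, band_mask(len(items), width=1))
output = mask_attention(items, items, items, band_mask(len(items), width=1))
```

With `width=1`, each row of `modified` still sums to 1, but the weight on items more than one position away is zero, so each item is influenced only by itself and its immediate neighbors. A machine-trained mask would replace the fixed 0/1 entries with learned values; the abstract leaves that parameterization open, so it is not sketched here.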

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.