Patent · US Active

Out-of-domain data augmentation for natural language processing

US12293155B2 · kind B2 · utility

Cited by	0

References	6

Claims	20

Family size	0

Assignee

Oracle International Corporation · US

Inventors

Elias Luqman Jalaluddin · Seattle, US
Vishal Vishnoi · Redwood City, US
Thanh Long Duong · Melbourne, AU
Mark Edward Johnson · Chatswood, AU
Poorya Zaremoodi · Melbourne, AU
Gautam Singaraju · Dublin, US
Ying Xu · Albion, AU
Vladislav Blinov · Melbourne, AU
Yu-Heng Hong · Melbourne, AU

Key dates

Filing date	Apr 9, 2024
Grant date	May 6, 2025
Priority date	—
Expiry date	Apr 9, 2044

Classification

Technology area (CPC H)	Electricity
CPC primary	H04L51/02
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: generating a data set of OOD examples, filtering out OOD examples from the data set of OOD examples, determining a difficulty value for each OOD example remaining within the filtered data set of OOD examples, and generating augmented batches of utterances including utterances from the training set of utterances and utterances from the filtered data set of OOD examples based on the difficulty value for each OOD example. Thereafter, the machine-learning model is trained using the augmented batches of utterances in accordance with a curriculum training protocol.
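The batching steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how OOD examples are generated, which ones are filtered out, or how difficulty is measured. The names `Example`, `filter_ood`, `difficulty`, and `augmented_batches` are all hypothetical. So are three assumptions: a filter that drops candidates duplicating an in-domain utterance, a difficulty proxy based on token overlap with the in-domain vocabulary, and an easy-to-hard ordering as the curriculum. Generation of OOD candidates and the training loop itself are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    text: str
    label: str  # an intent name, or "OOD" for out-of-domain examples


def tokens(text):
    """Lowercased whitespace tokens of an utterance, as a set."""
    return set(text.lower().split())


def filter_ood(candidates, in_domain_texts):
    """Assumed filter: drop candidates that duplicate an in-domain utterance."""
    seen = {t.lower().strip() for t in in_domain_texts}
    return [c for c in candidates if c.lower().strip() not in seen]


def difficulty(candidate, in_domain_vocab):
    """Assumed difficulty proxy: share of candidate tokens that also occur in
    the in-domain vocabulary (more overlap = harder to tell apart)."""
    toks = tokens(candidate)
    if not toks:
        return 0.0
    return len(toks & in_domain_vocab) / len(toks)


def augmented_batches(train, ood_candidates, batch_size=4, ood_per_batch=1):
    """Mix in-domain examples with filtered OOD examples, easiest OOD first."""
    in_texts = [ex.text for ex in train]
    vocab = set().union(*(tokens(t) for t in in_texts))
    kept = filter_ood(ood_candidates, in_texts)
    # Assumed curriculum: earlier batches receive lower-difficulty OOD examples.
    ranked = iter(sorted(kept, key=lambda c: difficulty(c, vocab)))
    in_per_batch = batch_size - ood_per_batch
    batches = []
    for start in range(0, len(train), in_per_batch):
        batch = list(train[start:start + in_per_batch])
        for _ in range(ood_per_batch):
            nxt = next(ranked, None)
            if nxt is not None:  # stop adding once OOD examples run out
                batch.append(Example(nxt, "OOD"))
        batches.append(batch)
    return batches
```

Each returned batch would then be passed to the model's training step in order, so that OOD examples sharing more vocabulary with the in-domain utterances appear later in training.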

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.