Noise data augmentation for natural language processing
US11538457B2 · kind B2 · utility
Assignee
Inventors
Key dates
| Event | Date |
| --- | --- |
| Filing date | Sep 9, 2020 |
| Grant date | Dec 27, 2022 |
| Priority date | — |
| Expiry date | Feb 21, 2041 |
Classification
- Technology area (CPC G): Physics
- CPC primary: G10L2015/227
- WIPO field: Computer technology
- WIPO sector: Electrical engineering
Abstract
Techniques for noise data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training an intent classifier to identify one or more intents for one or more utterances; augmenting the training set of utterances with noise text to generate an augmented training set of utterances; and training the intent classifier using the augmented training set of utterances. The augmenting includes: obtaining the noise text from a list of words, a text corpus, a publication, a dictionary, or any combination thereof irrelevant of original text within the utterances of the training set of utterances, and incorporating the noise text within the utterances relative to the original text in the utterances of the training set of utterances at a predefined augmentation ratio to generate augmented utterances.
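The augmentation step described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names `augment_utterance` and `augment_training_set` are hypothetical, and the sketch assumes that the "predefined augmentation ratio" means the number of noise words inserted per original word and that noise words are placed at random positions. The abstract does not specify either detail. Training the intent classifier on the result is left out.

```python
import random


def augment_utterance(utterance, noise_words, ratio, rng):
    """Return a copy of `utterance` with unrelated noise words inserted.

    Assumption: `ratio` is the number of noise words added per original word.
    """
    tokens = utterance.split()
    n_noise = round(len(tokens) * ratio)
    for _ in range(n_noise):
        # randint is inclusive at both ends, so noise may also land
        # before the first token or after the last one.
        position = rng.randint(0, len(tokens))
        tokens.insert(position, rng.choice(noise_words))
    return " ".join(tokens)


def augment_training_set(examples, noise_words, ratio=0.5, seed=0):
    """Return the original (utterance, intent) pairs plus one noisy copy of each.

    Each noisy copy keeps the intent label of the utterance it was built from.
    """
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    augmented = list(examples)
    for text, intent in examples:
        noisy_text = augment_utterance(text, noise_words, ratio, rng)
        augmented.append((noisy_text, intent))
    return augmented
```

Under these assumptions, calling `augment_training_set([("what is my balance", "check_balance")], ["lorem", "ipsum"], ratio=0.5)` would return the original pair plus one copy containing two extra noise words, with the original words still in their original order. The noise list here stands in for the "list of words, a text corpus, a publication, a dictionary" named in the abstract.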
Source: USPTO / EPO open patent data. Objective bibliographic data and citation counts.