Patent · US Active

Synthetic data generation for training of natural language understanding models

US11875787B2 · kind B2 · utility

Cited by	0

References	3

Claims	22

Family size	0

Assignee

MICROSOFT TECHNOLOGY LICENSING, LLC · US

Inventors

Baolin Peng · Bellevue, US
Chenguang Zhu · Sammamish, US
Chunyuan Li · Beijing, CN
Xiujun Li · Seattle, US
Jinchao Li · Redmond, US
Nanshan Zeng · Bellevue, US
Jianfeng Gao · Woodinville, US

Key dates

Filing date	Oct 11, 2022
Grant date	Jan 16, 2024
Priority date	—
Expiry date	Oct 11, 2042

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/1822
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

This document relates to machine learning. One example includes a method or technique that can be performed on a computing device. The method or technique can include obtaining a task-semantically-conditioned generative model that has been pretrained based at least on a first training data set having unlabeled training examples and semantically conditioned based at least on a second training data set having dialog act-labeled utterances. The method or technique can also include inputting dialog acts into the semantically-conditioned generative model and obtaining synthetic utterances that are output by the semantically-conditioned generative model. The method or technique can also include outputting the synthetic utterances.
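
The data flow described in the abstract (dialog acts in, synthetic utterances out) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `DialogAct` class, the `serialize` prompt format, and `toy_generator` are all hypothetical names, and `toy_generator` is a trivial stand-in for the pretrained, semantically conditioned generative model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class DialogAct:
    """Hypothetical dialog-act representation: an intent plus slot-value pairs."""
    intent: str
    slots: Tuple[Tuple[str, str], ...] = ()


def serialize(act: DialogAct) -> str:
    """Flatten a dialog act into a text prompt (the format is an assumption)."""
    if not act.slots:
        return f"{act.intent} ( )"
    body = " ; ".join(f"{name} = {value}" for name, value in act.slots)
    return f"{act.intent} ( {body} )"


def toy_generator(prompt: str) -> str:
    """Stand-in for the conditioned model; a real system would decode text here."""
    return f"[utterance realizing: {prompt}]"


def synthesize(
    acts: List[DialogAct],
    generate: Callable[[str], str],
) -> List[Tuple[DialogAct, str]]:
    """Input each dialog act to the generator and collect (act, utterance) pairs."""
    return [(act, generate(serialize(act))) for act in acts]


if __name__ == "__main__":
    acts = [
        DialogAct("inform", (("name", "Hilton"), ("area", "north"))),
        DialogAct("request", (("slot", "price"),)),
    ]
    for act, utterance in synthesize(acts, toy_generator):
        print(serialize(act), "->", utterance)
```

Because each output utterance stays paired with the dialog act that produced it, the resulting pairs could serve as labeled training examples for a natural language understanding model, which is the use indicated by the patent title.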

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.