Patent · US Active

Self-play to improve task-oriented dialog systems and methods

US12026544B2 · kind B2 · utility

Cited by	0
References	0
Claims	18
Family size	0

Assignee

BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD. · CN

Inventors

Kevin Knight · Grosse Isle, CA
Mariia Ryskina · Pittsburgh, US
Arkady Arkhangorodsky · Los Angeles, US
Ajay Nagesh · Los Angeles, US
Scot FANG · Los Angeles, US

Key dates

Filing date	Nov 25, 2020
Grant date	Jul 2, 2024
Priority date	—
Expiry date	Apr 14, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L2015/223
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

An automatic agent may be trained using reinforcement learning. A secret task may be obtained for a simulated user, and the secret task may be unknown to the automatic agent. At least one instruction to complete the secret task may be obtained from the simulated user according to at least one RL policy. At least one action may be generated by the automatic agent based on the at least one instruction and the at least one RL policy. Rewards may be determined for the simulated user and the automatic agent in response to determining that the at least one action successfully completes the secret task. The at least one RL policy may be adjusted based on the determined rewards.
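
The loop described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a single-turn exchange, integer-coded tasks, instructions, and actions, two separate tabular softmax policies (one for the simulated user, one for the automatic agent), and a REINFORCE-style update. The names `SoftmaxPolicy`, `run_episode`, and `self_play` are hypothetical and do not come from the patent.

```python
import math
import random


class SoftmaxPolicy:
    """Hypothetical tabular softmax policy: one preference row per observation."""

    def __init__(self, n_obs, n_choices, lr=0.5):
        self.prefs = [[0.0] * n_choices for _ in range(n_obs)]
        self.lr = lr

    def probs(self, obs):
        row = self.prefs[obs]
        m = max(row)  # subtract the max for numerical stability
        exps = [math.exp(p - m) for p in row]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self, obs, rng):
        choices = range(len(self.prefs[obs]))
        return rng.choices(choices, weights=self.probs(obs))[0]

    def update(self, obs, choice, reward):
        # REINFORCE step (an assumption, not stated in the abstract):
        # the gradient of log pi(choice | obs) for a softmax is onehot - probs.
        p = self.probs(obs)
        for c in range(len(p)):
            grad = (1.0 if c == choice else 0.0) - p[c]
            self.prefs[obs][c] += self.lr * reward * grad


def run_episode(user, agent, n_tasks, rng):
    # The secret task is visible to the simulated user only.
    secret_task = rng.randrange(n_tasks)
    # The simulated user issues an instruction according to its policy.
    instruction = user.sample(secret_task, rng)
    # The agent sees only the instruction and generates an action.
    action = agent.sample(instruction, rng)
    # Both sides are rewarded when the action completes the secret task.
    reward = 1.0 if action == secret_task else 0.0
    # Both policies are adjusted based on the determined reward.
    user.update(secret_task, instruction, reward)
    agent.update(instruction, action, reward)
    return reward


def self_play(n_tasks=3, n_instructions=3, n_episodes=2000, seed=0):
    rng = random.Random(seed)
    user = SoftmaxPolicy(n_tasks, n_instructions)
    agent = SoftmaxPolicy(n_instructions, n_tasks)
    rewards = [run_episode(user, agent, n_tasks, rng) for _ in range(n_episodes)]
    return user, agent, rewards
```

Calling `self_play()` returns both policies and the per-episode reward list, so the success rate over any window of episodes can be inspected. A zero reward leaves both policies unchanged in this sketch, and a shared reward can leave the two sides stuck on an ambiguous code, so convergence to a perfect mapping is not guaranteed.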

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.