Patent · US Active

Reinforcement learning techniques for dialogue management

US12271703B2 · kind B2 · utility

Cited by	0
References	1
Claims	20
Family size	0

Assignee

PAYPAL, INC. · US

Inventor

Rajesh Munavalli · San Jose, US

Key dates

Filing date	Sep 29, 2021
Grant date	Apr 8, 2025
Priority date	—
Expiry date	Dec 7, 2043

Classification

Technology area (CPC G)	Physics
CPC primary	G10L15/22
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

Techniques are disclosed herein relating to using reinforcement learning to generate a dialogue policy. A computer system may perform an iterative training operation to train a deep Q-learning network (DQN) based on conversation logs from prior conversations. In various embodiments, the DQN may include an input layer to receive an input value indicative of a current state of a given conversation, one or more hidden layers, and an output layer that includes a set of nodes corresponding to available responses. During the iterative training operation, the disclosed techniques may analyze utterances from a conversation log and, based on the utterances, use the DQN to determine appropriate responses. Reward values may be determined based on the selected responses and, based on the reward values, the DQN may be updated. Once generated, the dialogue policy may be used by a chatbot system to guide conversations with users.
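
The training loop described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the class name `TinyDQN`, the function `train_on_log`, the one-hot state encoding, the single ReLU hidden layer, the epsilon-greedy exploration, and the reward rule (+1 when the selected response matches the response recorded in the log, -1 otherwise) are all assumptions made for illustration. The sketch only shows the general shape of the idea: a state vector goes in, one Q-value per available response comes out, and the network is updated from reward values while replaying a conversation log.

```python
import numpy as np


class TinyDQN:
    """Hypothetical one-hidden-layer Q-network: state in, one Q-value per response out."""

    def __init__(self, state_dim, hidden_dim, num_responses, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (state_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0.0, 0.1, (hidden_dim, num_responses))
        self.b2 = np.zeros(num_responses)
        self.lr = lr

    def q_values(self, state):
        hidden = np.maximum(0.0, state @ self.w1 + self.b1)  # ReLU hidden layer
        return hidden @ self.w2 + self.b2  # one output node per response

    def select_response(self, state):
        return int(np.argmax(self.q_values(state)))

    def update(self, state, action, target):
        """One gradient step on 0.5 * (Q(state, action) - target)^2."""
        pre = state @ self.w1 + self.b1
        hidden = np.maximum(0.0, pre)
        error = (hidden @ self.w2 + self.b2)[action] - target
        # Backpropagate through the chosen output node only.
        grad_hidden = error * self.w2[:, action] * (pre > 0)
        self.w2[:, action] -= self.lr * error * hidden
        self.b2[action] -= self.lr * error
        self.w1 -= self.lr * np.outer(state, grad_hidden)
        self.b1 -= self.lr * grad_hidden


def train_on_log(dqn, log, gamma=0.9, epochs=300, epsilon=0.2, seed=0):
    """Replay a log of (state, logged_response, next_state, done) tuples."""
    rng = np.random.default_rng(seed)
    num_responses = dqn.b2.shape[0]
    for _ in range(epochs):
        for state, logged_response, next_state, done in log:
            # Epsilon-greedy: mostly use the DQN, sometimes explore.
            if rng.random() < epsilon:
                action = int(rng.integers(num_responses))
            else:
                action = dqn.select_response(state)
            # Assumed reward rule: agreement with the logged response.
            reward = 1.0 if action == logged_response else -1.0
            target = reward
            if not done:
                target += gamma * float(np.max(dqn.q_values(next_state)))
            dqn.update(state, action, target)
    # Greedy dialogue policy: one response index per logged state.
    return [dqn.select_response(state) for state, _, _, _ in log]


# Toy log: three conversation states (one-hot) and three available responses.
states = np.eye(3)
log = [
    (states[0], 1, states[1], False),
    (states[1], 2, states[2], False),
    (states[2], 0, states[2], True),
]
dqn = TinyDQN(state_dim=3, hidden_dim=16, num_responses=3)
policy = train_on_log(dqn, log)
print(policy)
```

In this sketch the returned list plays the role of the dialogue policy: a chatbot would encode the current conversation state, call `select_response`, and send the response at that index. A production system would more likely use a deep learning framework, a replay buffer, and a separate target network; those are omitted here to keep the example short.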

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.