Patent · US Active

Online temporal difference learning from incomplete customer interaction histories

US9367820B2 · kind B2 · utility

Cited by	6
References	0
Claims	21
Family size	0

Assignee

NICE SYSTEMS TECHNOLOGIES UK LIMITED · GB

Inventors

Leonard Michael Newnham · London, GB
Jason Derek McFall · London, GB
David James Barker · Faringdon, GB
David Silver · Hitchin, GB

Key dates

Filing date	Dec 16, 2014
Grant date	Jun 14, 2016
Priority date	—
Expiry date	Dec 16, 2034

Classification

Technology area (CPC G)	Physics
CPC primary	G06N5/04
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

In one embodiment, an indication that a decision has been requested, selected, or applied with respect to one or more users may be obtained. After this indication is obtained, a value function may be updated, where the value function approximates an expected reward associated with the one or more users over time since the decision was requested, selected, or applied with respect to the one or more users. The value function may be updated by performing or providing one or more updates to the value function, where the time at which each of the one or more updates is performed or provided is independent of activity of the one or more users.
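
To make the idea of updates timed independently of user activity more concrete, the following is a minimal illustrative sketch, not the method claimed in the patent. It assumes a tabular value function over elapsed-time buckets since the decision and a plain TD(0) update; the names `DecisionEpisode`, `TimeBucketValue`, and `td_update`, along with the bucket width, step size, and discount, are all hypothetical choices made for illustration. The caller invokes `td_update` on whatever schedule it likes (for example, a fixed timer), whether or not the user has done anything in the meantime.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionEpisode:
    """Hypothetical record of one decision and the reward seen since the last update."""
    decided_at: float
    pending_reward: float = 0.0
    last_update: float = field(init=False)

    def __post_init__(self):
        # No update has happened yet, so the last update time is the decision time.
        self.last_update = self.decided_at


class TimeBucketValue:
    """Assumed tabular value function indexed by elapsed time since the decision."""

    def __init__(self, n_buckets, bucket_width, alpha=0.1, gamma=1.0):
        self.n_buckets = n_buckets
        self.bucket_width = bucket_width
        self.alpha = alpha  # step size
        self.gamma = gamma  # discount factor
        # One extra entry acts as a terminal state whose value stays at 0.
        self.values = [0.0] * (n_buckets + 1)

    def bucket(self, elapsed):
        """Map elapsed time to a bucket index, capped at the terminal bucket."""
        return min(int(elapsed // self.bucket_width), self.n_buckets)

    def td_update(self, episode, now):
        """Apply one TD(0) update at time `now`, chosen by the caller's schedule.

        A period with no user activity simply contributes a reward of 0.
        """
        s = self.bucket(episode.last_update - episode.decided_at)
        if s >= self.n_buckets:
            return  # episode has already reached the terminal bucket
        s_next = self.bucket(now - episode.decided_at)
        target = episode.pending_reward + self.gamma * self.values[s_next]
        self.values[s] += self.alpha * (target - self.values[s])
        episode.pending_reward = 0.0
        episode.last_update = now
```

In this sketch, a reward that arrives between two scheduled ticks would be added to `pending_reward` and consumed at the next tick, so an incomplete interaction history still yields an update at every tick.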

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.