Patent · US Active

Rule creation using MDP and inverse reinforcement learning

US11003998B2 · kind B2 · utility

Cited by	0
References	1
Claims	14
Family size	0

Assignee

International Business Machines Corporation · US

Inventors

Akira Koseki · Sagamihara, JP
Tetsuro Morimura · Tokyo, JP
Toshiro Takase · Urayasu, JP
Hiroki Yanagisawa · Tokyo, JP

Key dates

Filing date	Nov 14, 2017
Grant date	May 11, 2021
Priority date	—
Expiry date	Jan 14, 2040

Classification

Technology area (CPC G)	Physics
CPC primary	G05D1/0088
WIPO field	Computer technology
WIPO sector	Electrical engineering

Abstract

A method is provided for rule creation that includes receiving (i) an MDP model with a set of states, a set of actions, and a set of transition probabilities, (ii) a policy that corresponds to rules for a rule engine, and (iii) a set of candidate states that can be added to the set of states. The method includes transforming the MDP model to include a reward function using an inverse reinforcement learning process on the MDP model and on the policy. The method includes finding a state from the candidate states, and generating a refined MDP model with the reward function by updating the transition probabilities related to the state. The method includes obtaining an optimal policy for the refined MDP model with the reward function, based on the reward policy, the state, and the updated probabilities. The method includes updating the rule engine based on the optimal policy.
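
The later steps of the abstract (refining the MDP with a candidate state, re-solving for an optimal policy, and turning that policy into rules) can be illustrated with a small sketch. This is not the patented implementation: the inverse reinforcement learning step is skipped and the reward function is simply assumed to be given, value iteration is swapped in as one standard way to obtain an optimal policy, and every name here (`value_iteration`, `add_candidate_state`, `policy_to_rules`, the states `start`/`goal`/`hazard`, the actions `fast`/`slow`, all probabilities and rewards, and the IF/THEN rule format) is a hypothetical chosen for illustration.

```python
from typing import Dict, List, Tuple

State = str
Action = str
Dist = Dict[State, float]                      # next state -> probability
Transitions = Dict[State, Dict[Action, Dist]]  # P[s][a] -> distribution
Rewards = Dict[State, float]                   # reward received on entering a state


def q_value(P: Transitions, R: Rewards, V: Dict[State, float],
            s: State, a: Action, gamma: float) -> float:
    """Expected discounted return of taking action a in state s."""
    return sum(p * (R[s2] + gamma * V[s2]) for s2, p in P[s][a].items())


def value_iteration(P: Transitions, R: Rewards,
                    gamma: float = 0.9, tol: float = 1e-8) -> Dict[State, Action]:
    """Return a greedy policy for the MDP (P, R) via in-place value iteration."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(P, R, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    return {s: max(P[s], key=lambda a: q_value(P, R, V, s, a, gamma)) for s in P}


def add_candidate_state(P: Transitions, R: Rewards, state: State, reward: float,
                        outgoing: Dict[Action, Dist],
                        incoming: Dict[Tuple[State, Action], Dist]
                        ) -> Tuple[Transitions, Rewards]:
    """Build a refined MDP: add `state` and overwrite the transitions that touch it.

    Assumption: the reward of the new state is supplied by the caller.
    """
    refined = {s: {a: dict(d) for a, d in acts.items()} for s, acts in P.items()}
    refined[state] = {a: dict(d) for a, d in outgoing.items()}
    for (s, a), dist in incoming.items():
        if abs(sum(dist.values()) - 1.0) > 1e-9:
            raise ValueError(f"probabilities for ({s}, {a}) must sum to 1")
        refined[s][a] = dict(dist)
    refined_R = dict(R)
    refined_R[state] = reward
    return refined, refined_R


def policy_to_rules(policy: Dict[State, Action]) -> List[str]:
    """Render a policy as IF/THEN strings (a made-up rule-engine format)."""
    return [f"IF state == {s!r} THEN action = {a!r}" for s, a in policy.items()]


# Toy model: "fast" always reaches the goal; "slow" reaches it 80% of the time.
P: Transitions = {
    "start": {"fast": {"goal": 1.0}, "slow": {"goal": 0.8, "start": 0.2}},
    "goal": {"fast": {"goal": 1.0}, "slow": {"goal": 1.0}},
}
R: Rewards = {"start": 0.0, "goal": 1.0}
base_policy = value_iteration(P, R)

# Refinement: a candidate state "hazard" that "fast" now enters 30% of the time.
refined_P, refined_R = add_candidate_state(
    P, R, "hazard", reward=-5.0,
    outgoing={"fast": {"hazard": 1.0}, "slow": {"hazard": 1.0}},
    incoming={("start", "fast"): {"goal": 0.7, "hazard": 0.3}},
)
refined_policy = value_iteration(refined_P, refined_R)
rules = policy_to_rules(refined_policy)
print(base_policy["start"], refined_policy["start"])
```

In this toy model the policy for `start` switches from `fast` to `slow` once the penalized `hazard` state is added, which is the kind of rule change the refined model is meant to surface. A full pipeline would first recover `R` from the existing rule-engine policy by inverse reinforcement learning rather than writing it by hand.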

Source: USPTO / EPO open patent data. Objective bibliographic and citation counts.