15th European Conference on Artificial Intelligence
July 21-26, 2002, Lyon, France
Istvan Szita, Balint Takacs, Andras Lorincz
Recently, a novel reinforcement learning algorithm called event-learning, or E-learning, was introduced. The algorithm is based on events, which are defined as ordered pairs of states. In this setting, the agent optimizes the selection of desired sub-goals by traditional value or policy iteration, and uses a separate algorithm, called the controller, to achieve these goals. The advantage of event-learning lies in its potential in non-stationary environments, where the near-optimality of value iteration is guaranteed by the generalized epsilon-stationary MDP (epsilon-MDP) model. Using a particular non-Markovian controller, the SDS controller, an epsilon-MDP problem arises in E-learning. We illustrate the properties of E-learning augmented by the SDS controller by computer simulations.
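The split between sub-goal selection and control described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the five-state chain task, the deterministic controller, the reward, the discount factor, and the backup rule used here are all assumptions chosen for illustration, and the controller is a simple stand-in, not the SDS controller. The agent keeps a value E(x, y) for each event (current state x, desired next state y), and a separate controller turns the chosen event into an action.

```python
GOAL = 4      # hypothetical toy task: states 0..4 on a chain, goal at state 4
GAMMA = 0.9   # assumed discount factor

def desired_successors(x):
    """Sub-goals (desired next states) the agent may propose from state x."""
    return [y for y in (x - 1, x + 1) if 0 <= y <= GOAL]

def controller(x, y_desired):
    """Hypothetical stand-in controller: maps the event (x, y_desired) to an action."""
    return 1 if y_desired > x else -1

def step(x, action):
    """Deterministic toy environment; reward 1 on entering the goal."""
    x_next = min(max(x + action, 0), GOAL)
    return x_next, (1.0 if x_next == GOAL else 0.0)

def state_value(E, x):
    """Value of a state: best event value available there (0 at the absorbing goal)."""
    if x == GOAL:
        return 0.0
    return max(E[(x, y)] for y in desired_successors(x))

def event_value_iteration(sweeps=50):
    """Value iteration over events, using the controller to realize each event."""
    E = {(x, y): 0.0 for x in range(GOAL) for y in desired_successors(x)}
    for _ in range(sweeps):
        for (x, y) in E:
            x_next, reward = step(x, controller(x, y))
            # Assumed backup: reward plus discounted value of the state actually reached.
            E[(x, y)] = reward + GAMMA * state_value(E, x_next)
    return E

def greedy_subgoal(E, x):
    """Desired next state with the highest event value in state x."""
    return max(desired_successors(x), key=lambda y: E[(x, y)])

E = event_value_iteration()
print([greedy_subgoal(E, x) for x in range(GOAL)])  # [1, 2, 3, 4] for this toy task
```

In this sketch the value function never refers to actions directly. Replacing `controller` with a different low-level routine leaves the sub-goal selection code unchanged, which is the separation the abstract describes.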
Keywords: Reinforcement Learning, Machine Learning, Robotics
Citation: Istvan Szita, Balint Takacs, Andras Lorincz: Reinforcement Learning Integrated with a Non-Markovian Controller. In F. van Harmelen (ed.): ECAI 2002, Proceedings of the 15th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2002, pp. 365-369.