Shaker, Marwan and Yue, Shigang and Duckett, Tom (2009) Vision-based reinforcement learning using approximate policy iteration. In: 14th International Conference on Advanced Robotics (ICAR), 22-26 June 2009, Munich, Germany.
Item Type: Conference or Workshop Item (Paper)
Divisions: College of Science > School of Computer Science
Abstract: A major issue for reinforcement learning (RL) applied to robotics is the time required to learn a new skill. While RL has been used to learn mobile robot control in many simulated domains, applications involving learning on real robots are still relatively rare. In this paper, the Least-Squares Policy Iteration (LSPI) reinforcement learning algorithm and a new model-based algorithm, Least-Squares Policy Iteration with Prioritized Sweeping (LSPI+), are implemented on a mobile robot to acquire new skills quickly and efficiently. LSPI+ combines the benefits of LSPI and prioritized sweeping, which uses all previous experience to focus the computational effort on the most “interesting” or dynamic parts of the state space. The proposed algorithms are tested on a household vacuum-cleaner robot learning a docking task using vision as the only sensor modality. In experiments, these algorithms are compared to other model-based and model-free RL algorithms. The results show that the number of trials required to learn the docking task is significantly reduced using LSPI compared to the other RL algorithms investigated, and that LSPI+ further improves on the performance of LSPI.
Date Deposited: 14 Dec 2009 12:19
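The core of the LSPI algorithm named in the abstract is LSTD-Q inside a policy-iteration loop: the Q-function of the current greedy policy is fitted by least squares from a fixed batch of experience, and the policy is then improved greedily. A minimal sketch on a toy chain MDP follows; the MDP, the one-hot features, and all names here are illustrative assumptions, not the paper's vision-based robot setup.

```python
import numpy as np

# Toy chain MDP (assumed for illustration): states 0..3, actions
# 0 (left) / 1 (right), reward 1 for stepping into goal state 3 (terminal).
n_states, n_actions, gamma = 4, 2, 0.9
k = n_states * n_actions          # one-hot feature dimension

def phi(s, a):
    f = np.zeros(k)
    f[s * n_actions + a] = 1.0
    return f

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    r = 1.0 if s2 == n_states - 1 else 0.0
    return s2, r, s2 == n_states - 1

# Fixed batch covering every (s, a) pair: LSPI reuses the same
# experience across all policy-iteration sweeps.
samples = []
for s in range(n_states - 1):
    for a in range(n_actions):
        s2, r, done = step(s, a)
        samples.append((s, a, r, s2, done))

def lstdq(samples, w):
    """One LSTD-Q solve: evaluate the greedy policy induced by w."""
    A = np.eye(k) * 1e-6          # tiny ridge term for invertibility
    b = np.zeros(k)
    for s, a, r, s2, done in samples:
        f = phi(s, a)
        if done:
            f_next = np.zeros(k)  # no bootstrap from terminal states
        else:
            a2 = np.argmax([phi(s2, u) @ w for u in range(n_actions)])
            f_next = phi(s2, a2)
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)

# Policy iteration: alternate LSTD-Q evaluation and greedy improvement.
w = np.zeros(k)
for _ in range(20):
    w_new = lstdq(samples, w)
    if np.linalg.norm(w_new - w) < 1e-8:
        break
    w = w_new

policy = [int(np.argmax([phi(s, a) @ w for a in range(n_actions)]))
          for s in range(n_states)]
print(policy)
```

On this toy chain the greedy policy converges to moving right toward the goal in the non-terminal states; the paper's LSPI+ additionally learns a model and applies prioritized sweeping to that experience, which this sketch does not attempt.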