From Evaluating to Teaching: Rewards and Challenges of Human Control for Learning Robots

Senft, Emmanuel and Lemaignan, Severin and Baxter, Paul and Belpaeme, Tony (2018) From Evaluating to Teaching: Rewards and Challenges of Human Control for Learning Robots. In: IROS 2018 Workshop on Human/Robot in the Loop Machine Learning, 1st October 2018, Madrid, Spain.

Full content URL: http://wp.doc.ic.ac.uk/bbl/wp-content/uploads/site...

Item Type: Conference or Workshop contribution (Paper)
Item Status: Live Archive

Abstract

Keeping a human in a robot learning cycle can provide many advantages to improve the learning process. However, most of these improvements are only available when the human teacher is in complete control of the robot’s behaviour, and not just providing feedback. This human control can make the learning process safer, allowing the robot to learn in high-stakes interaction scenarios, especially social ones. Furthermore, it allows faster learning, as the human guides the robot to the relevant parts of the state space and can provide additional information to the learner. This information can also enable the learning algorithms to learn for wider world representations, thus increasing the generalisability of a deployed system. Additionally, learning from end users improves the precision of the final policy, as it can be specifically tailored to many situations. Finally, this progressive teaching might create trust between the learner and the teacher, easing the deployment of the autonomous robot. However, with such control comes a range of challenges. Firstly, the rich communication between the robot and the teacher needs to be handled by an interface, which may require complex features. Secondly, the teacher needs to be embedded within the robot action selection cycle, imposing time constraints, which increases the cognitive load on the teacher. Finally, given a cycle of interaction between the robot and the teacher, any mistakes made by the teacher can be propagated to the robot’s policy. Nevertheless, we are able to show that empowering the teacher with ways to control a robot’s behaviour has the potential to drastically improve both the learning process (allowing robots to learn in a wider range of environments) and the experience of the teacher.

Keywords: Human-robot interaction, interactive reinforcement learning, interactive machine learning, educational robotics, robotics
Subjects: G Mathematical and Computer Sciences > G700 Artificial Intelligence
G Mathematical and Computer Sciences > G440 Human-computer Interaction
G Mathematical and Computer Sciences > G760 Machine Learning
Divisions: College of Science > School of Computer Science
ID Code: 36200
Deposited On: 19 Jun 2019 08:52
