Adam White

PhD Student
Department of Computing Science
University of Alberta

amw8@ualberta.ca
(780) 906-4587

Research

Reinforcement Learning, Robotics, Knowledge Representation and Intrinsic Motivation

How to learn and represent grounded knowledge in a physically embodied agent is a key open problem in artificial intelligence (AI). This problem is challenging because grounding knowledge often requires learning at a fine time scale from low-level sensorimotor data such as IR distance sensor readings and motor velocities. There is also a significant computational challenge: the agent has limited resources with which to learn a large amount of knowledge about a high-dimensional, non-stationary environment. Many of the languages proposed for representing agent knowledge trade off expressiveness against learnability. Expressive languages, like first-order logic, Bayes networks and fuzzy logic, can represent a wide range of knowledge about people, objects and relationships, but are computationally expensive to update and often require supervision. Differential equation models (from operations research) and transition matrices (from reinforcement learning), on the other hand, can be learned efficiently from data but are much less expressive. These low-level models cannot directly represent abstract, temporally extended predictions and require significant prior knowledge. There remains room for exploring new approaches to learning expressive knowledge from experience generated by an autonomous robot.

The goal of my doctoral research is to build a mobile robot that can learn a significant amount of knowledge from its interaction with the world: a demonstration of a complete working system on a robot. To achieve this goal I will design and build an architecture that integrates (1) representing and updating knowledge about the world, (2) generating behaviour and (3) maintaining the internal state of the robot. I will use a new approach to knowledge representation based on value functions and on other ideas and algorithms from reinforcement learning. The scope of my research will include how to scale the system, given this representation of knowledge, by adapting the behaviour generation and state representation update procedures based on experience generated by the robot. My architecture will be the first real-time architecture that can learn flexible, general-purpose knowledge about the world directly from sensorimotor data generated by the Critterbot.


The Critterbot


Papers

[1] Martha White and Adam White. (2010) Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains, Neural Information Processing Systems Conference.

[2] J. Modayil, P. M. Pilarski, A. White, T. Degris and R. S. Sutton. (2010) Off-Policy Knowledge Maintenance for Robots, Robotics: Science and Systems Workshop: Towards Closing the Loop: Active Learning for Robotics, Zaragoza, Spain, June 27-30 (2-page extended abstract and poster).

[3] Shimon Whiteson, Brian Tanner and Adam White. (2010) The Reinforcement Learning Competitions, Artificial Intelligence Magazine, Summer Issue 2010.

[4] Brian Tanner and Adam White. (2009) RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments, Journal of Machine Learning Research.

[5] Adam M. White and Elliot Ludvig. (2008) Using shaping to speed up reinforcement learning on complex tasks, 22nd Annual Joseph R. Royce Research Conference.

[6] Adam M. White. (2006) A Standard System for Benchmarking in Reinforcement Learning, MSc Thesis, University of Alberta.

[7] Nathan Sturtevant and Adam White. (2006) Feature Construction for Reinforcement Learning in Hearts, 5th International Conference on Computers and Games.

[8] Adam M. White and Richard S. Sutton. (2005) RL-Glue: NIPS RLBB and Beyond, Advances in Neural Information Processing Systems 19 (NIPS) workshop: Reinforcement Learning Benchmarks and Bakeoffs II. 1: 50-52

[9] R. Shaw, L. Garey and A. White. (2005) Adapting Root Finding Methods for Discrete Searching, International Conference on Statistics, Mathematics and Related Fields.

[10] R. Shaw, L. Garey and A. White. (2004) A Parallel QR Factorization Algorithm for Solving Toeplitz Tridiagonal Systems, Proceedings of the 18th International Parallel and Distributed Processing Symposium. 18: 235-242

Contact info

Office: CSC 3-05 (Computing Science Center), (780) 492-????

Mail:
Department of Computing Science
University of Alberta
Edmonton, Alberta T6G 2E8
Canada