Publication Date

Fall 2018

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Chris Pollett

Second Advisor

Robert Chun

Third Advisor

Katerina Potika

Keywords

Reinforcement Learning, Trust Region Policy Optimisation, Rock Climbing


Abstract

Reinforcement Learning (RL) is a field of Artificial Intelligence that has gained a lot of attention in recent years. In this project, RL research was used to design and train an agent to climb and navigate through an environment with slopes. We compared and evaluated the performance of two state-of-the-art reinforcement learning algorithms for locomotion-related tasks, Deep Deterministic Policy Gradients (DDPG) and Trust Region Policy Optimisation (TRPO). We observed that, on average, training with TRPO was three times faster than training with DDPG, and also much more stable, for the locomotion control tasks that we experimented with. After these experiments, we designed an environment using insights from transfer learning and successfully trained an agent to climb slopes of up to 36°.