Publication Date

Spring 5-18-2020

Degree Type

Master's Project

Degree Name

Master of Science (MS)


Department

Computer Science

First Advisor

Robert Chun

Second Advisor

Jon Pearce

Third Advisor

Chris Tseng


Abstract

Reinforcement Learning (RL) is a machine learning technique in which an agent learns to perform a complex action through repeated trial and error, with the goal of maximizing a well-defined reward function. This form of learning has found applications in robot locomotion, where it has been used to teach robots to traverse complex terrain. While RL algorithms may work well for training robot locomotion, they tend to generalize poorly when the agent is placed in an environment it has never encountered before. Possible solutions from the literature include training a destabilizing adversary alongside the locomotive learning agent. The adversary applies external forces to the agent in an attempt to destabilize it, which may help the locomotive agent learn to deal with unexpected scenarios. For this project, we train a robust, simulated quadruped robot to traverse variable terrain. We compare and analyze Proximal Policy Optimization (PPO) with and without the use of an adversarial agent, and determine which of the two produces better results.
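The interaction between the locomotive agent and the destabilizing adversary can be illustrated with a minimal sketch. It is not the project's implementation and does not perform PPO training; it only shows how an adversary's external force might enter the same dynamics as the agent's action. The 1-D point mass, the hand-written policies, the force bounds, and the zero-sum reward (the adversary receives the negative of the agent's return) are all assumptions made for illustration.

```python
import random


def step(position, velocity, agent_force, adversary_force, dt=0.05):
    """Advance a toy 1-D point mass; both forces act on the same body."""
    velocity += (agent_force + adversary_force) * dt
    position += velocity * dt
    return position, velocity


def stabilizing_agent(position, velocity):
    """Hypothetical stand-in for a learned policy: push back toward the origin."""
    return -4.0 * position - 2.0 * velocity


def no_adversary(position, velocity, rng):
    """Baseline: no external disturbance is applied."""
    return 0.0


def random_adversary(position, velocity, rng):
    """Hypothetical stand-in for a learned adversary: a bounded random push."""
    return rng.uniform(-1.0, 1.0)


def rollout(agent_policy, adversary_policy, steps=200, seed=0):
    """Run one episode and return (agent_return, adversary_return)."""
    rng = random.Random(seed)
    position, velocity = 0.0, 0.0
    agent_return = 0.0
    for _ in range(steps):
        agent_force = agent_policy(position, velocity)
        adversary_force = adversary_policy(position, velocity, rng)
        position, velocity = step(position, velocity, agent_force, adversary_force)
        # Assumed reward: the agent is penalized for drifting from the origin.
        agent_return += -abs(position)
    # Assumed zero-sum setup: the adversary gains what the agent loses.
    return agent_return, -agent_return


if __name__ == "__main__":
    print("without adversary:", rollout(stabilizing_agent, no_adversary))
    print("with adversary:   ", rollout(stabilizing_agent, random_adversary))
```

In an actual experiment, both hand-written policies would be replaced by learned ones (for example, each trained with PPO), and the point mass by the simulated quadruped and its terrain.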