Publication Date
Spring 2018
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
Abstract
Path Finding is a vastly studied subject in the field of Computer Science. The problem of path-finding is defined as the discovery and plotting of an optimal route between two points on a plane. The existing algorithms that solve this problem are mostly static and rely heavily on the prior knowledge of the environment. They also require the environment to be deterministic. However, in real-world applications of the path-finding problem, often the environment is priorly unknown and stochastic, and with several conflicting objectives. In such cases, the aforementioned algorithms fail to produce effective results. In this project, we study and use a reinforcement learning approach for solving the many-objective path-finding problem, called Voting Q-Learning (VoQL), a model-free, on-policy learning algorithm. In this project, a set of optimal policies is determined with the help of the VoQL algorithm. This algorithm uses various voting methods borrowed from the field of social choice theory for action-selection. In addition to working with the existing methods for VOQL, the performance of additional voting methods is studied and evaluated for the first time.
Recommended Citation
Thombre, Prashant, "Multi-objective Path Finding Using Reinforcement Learning" (2018). Master's Projects. 643.
DOI: https://doi.org/10.31979/etd.2ntb-4j8q
https://scholarworks.sjsu.edu/etd_projects/643
Comments
Path Finding is a vastly studied subject in the field of Computer Science. The problem of path-finding is defined as the discovery and plotting of an optimal route between two points on a plane. The existing algorithms that solve this problem are mostly static and rely heavily on the prior knowledge of the environment. They also require the environment to be deterministic. However, in real-world applications of the path-finding problem, often the environment is priorly unknown and stochastic, and with several conflicting objectives. In such cases, the aforementioned algorithms fail to produce effective results. In this project, we study and use a reinforcement learning approach for solving the many-objective path-finding problem, called Voting Q-Learning (VoQL), a model-free, on-policy learning algorithm. In this project, a set of optimal policies is determined with the help of the VoQL algorithm. This algorithm uses various voting methods borrowed from the field of social choice theory for action-selection. In addition to working with the existing methods for VOQL, the performance of additional voting methods is studied and evaluated for the first time.