Publication Date

Fall 2020

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Engineering

Advisor

Wencen Wu

Keywords

Advection Diffusion Fields, Deep Learning, Path Planning, Reinforcement Learning, Robots

Subject Areas

Computer engineering

Abstract

Many environmental processes can be represented mathematically by spatially and temporally varying partial differential equations. Timely estimation and prediction of processes such as wildfires is critical for disaster management and response, but is difficult to accomplish without a dense network of stationary sensors. In this work, we propose a deep reinforcement learning-based real-time path-planning algorithm for a mobile sensor network traveling in formation through a spatial-temporally varying advection-diffusion field, with the goal of reconstructing the field. A deep Q-network (DQN) agent is trained on simulated advection-diffusion fields to direct the mobile sensor network along information-rich trajectories. The field measurements made by the sensors along their trajectories enable identification of the field's advection parameters, which are required for field reconstruction. A cooperative Kalman filter developed in previous works is employed to obtain estimates of the field values and gradients, which are essential both for reconstruction and for estimation of the diffusion parameter. A mechanism is provided that encourages exploration of the field domain once a stationary state is reached, allowing the algorithm to identify other information-rich trajectories that may exist in the field and significantly improving reconstruction performance. Two simulation environments of different fidelities are provided to test the feasibility of the proposed algorithm. The low-fidelity environment is used to train the DQN agent, while the high-fidelity environment is based on the Robot Operating System (ROS) and simulates real robots. Results from sample test episodes in both environments demonstrate the effectiveness and feasibility of the proposed algorithm.
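
For context, advection-diffusion fields of the kind described above are commonly governed by a partial differential equation of the following standard form; the exact parameterization used in the thesis may differ, so this is only an illustrative sketch in which c denotes the scalar field, v the advection (velocity) vector, k the diffusion coefficient, and s a source term:

\[
\frac{\partial c(\mathbf{x}, t)}{\partial t}
  + \mathbf{v} \cdot \nabla c(\mathbf{x}, t)
  = k \, \nabla^{2} c(\mathbf{x}, t) + s(\mathbf{x}, t)
\]

Under this reading, the "advection parameters" identified from the sensor measurements correspond to v, while the "diffusion parameter" estimated via the cooperative Kalman filter's field value and gradient estimates corresponds to k.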
