Multi-Agent Deep Reinforcement Learning for Walker Systems
Proceedings - 20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021
We applied a state-of-the-art Deep Reinforcement Learning (DRL) algorithm, Proximal Policy Optimization (PPO), to minimal legged-robot locomotion in challenging multi-agent, continuous, high-dimensional state-space environments. The main contribution of this work is identifying the key factors/hyperparameters and their effects on performance in multi-agent settings as the number of agents varies. Based on comprehensive experiments with 2-10 multi-walker environments, we found that 1) the minibatch size and the sample reuse ratio (the experience buffer size expressed as a multiple of the minibatch size) are critical hyperparameters for improving PPO performance; 2) the optimal neural network size depends on the number of walkers in the multi-agent environment; and 3) parameter sharing among agents is a better training strategy than fully independent learning, achieving comparable performance with improved efficiency, since fewer parameters consume less memory.
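The memory argument for parameter sharing in finding 3 can be illustrated with a small sketch (the function names and layer dimensions below are our own illustrative assumptions, not taken from the paper): with fully independent learning each walker carries its own policy network, so the parameter count grows linearly with the number of agents, while a shared policy keeps it constant.

```python
# Hypothetical sketch: parameter footprint of independent vs. shared
# PPO policies for an N-walker environment. Layer sizes are assumed,
# not the paper's actual architecture.

def mlp_param_count(sizes):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(i * o + o for i, o in zip(sizes, sizes[1:]))

def independent_params(n_agents, sizes):
    # Fully independent learning: one policy network per walker,
    # so memory grows linearly with the number of agents.
    return n_agents * mlp_param_count(sizes)

def shared_params(n_agents, sizes):
    # Parameter sharing: every walker queries the same network,
    # so memory is constant in the number of agents.
    return mlp_param_count(sizes)

if __name__ == "__main__":
    layers = [31, 64, 64, 4]  # assumed obs-dim, two hidden layers, action-dim
    for n in (2, 5, 10):
        print(n, independent_params(n, layers), shared_params(n, layers))
```

With these assumed sizes, ten independent walkers store ten times the parameters of the shared-policy setup, which is the efficiency gain the abstract refers to.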
Deep Reinforcement Learning (DRL), Multi-agent DRL (MADRL), Proximal Policy Optimization (PPO)
Inhee Park and Teng Sheng Moh. "Multi-Agent Deep Reinforcement Learning for Walker Systems" Proceedings - 20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021 (2021): 490-495. https://doi.org/10.1109/ICMLA52953.2021.00082