I’m currently trying to train a MultiAgentEnvironment with a MultiDiscrete action space. Since a multidiscrete action space would be preferable over a continuous one in this simulation, I’d like to ask whether it would be possible to add MultiDiscrete action spaces to the supported spaces for Rainbow/DQN.
I know that quite a few people have tried to use MultiDiscrete action spaces, but they either flattened the action space or switched to PPO instead. Simply flattening the action space massively increases the size of the network’s output layer, because every possible combination of discrete actions gets its own Q-value. This approach would therefore most likely increase both the training time and the required network complexity.
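To make the blowup concrete, here is a small sketch (the branch sizes are made up for illustration) comparing the output-layer size of a naively flattened MultiDiscrete space with one Q-value head per branch, plus the index conversion a flattening wrapper would need:

```python
import numpy as np

# Hypothetical MultiDiscrete action space: three branches with 4, 5 and 6 choices.
branch_sizes = [4, 5, 6]

# Naive flattening: one Q-value per *combination* of branch actions.
flattened_outputs = int(np.prod(branch_sizes))  # 4 * 5 * 6 = 120

# Per-branch Q-value heads (as in action-branching architectures like BDQ):
# one output per choice per branch, so the size grows additively, not multiplicatively.
branched_outputs = sum(branch_sizes)  # 4 + 5 + 6 = 15

def branches_to_flat(actions, sizes):
    """Encode per-branch actions as a single flat Discrete index."""
    idx = 0
    for a, s in zip(actions, sizes):
        idx = idx * s + a
    return idx

def flat_to_branches(idx, sizes):
    """Decode a flat Discrete index back into per-branch actions."""
    actions = []
    for s in reversed(sizes):
        actions.append(idx % s)
        idx //= s
    return list(reversed(actions))
```

With ten binary branches the flattened head already needs 2^10 = 1024 outputs, while per-branch heads need only 20, which is why native MultiDiscrete support seems preferable to a flattening wrapper.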
This would not only help in my specific use case, but would also make it possible to evaluate the performance of Rainbow against PPO in several example use cases.
I hope someone can help me train my simulation with Rainbow, or point me to the parts of the code I could change to make it work.