Value based methods compatible with multi-discrete action space?

Are any of the value-based deep RL algorithms in RLlib, like DQN, compatible with multi-discrete action spaces? I tried a few and all of them were incompatible. Is there any workaround to handle a multi-discrete action space with DQN?

Hi saeid93,

The multi-discrete action space is not supported by DQN.
How large is your action space? You can turn a multi-discrete action space into a discrete one by letting each action in the discrete space correspond to one legal combination of actions from the multi-discrete one. Note that this grows your action space size exponentially in the number of dimensions.
If your action space is small and you decide that this is feasible, you need to wrap your environment yourself to do the conversion.
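As a minimal sketch of that conversion (plain Python, no gym dependency; `nvec` stands for the list of per-dimension sizes, e.g. `MultiDiscrete([3, 4])` gives `nvec = [3, 4]`), you can treat the multi-discrete action as a mixed-radix number and map it to a single integer index:

```python
def flatten_action(multi_action, nvec):
    """Map a multi-discrete action (one index per dimension) to a single
    discrete index, treating nvec as a mixed-radix number system."""
    index = 0
    for a, n in zip(multi_action, nvec):
        index = index * n + a
    return index

def unflatten_action(index, nvec):
    """Inverse mapping: recover the multi-discrete action from the index."""
    action = []
    for n in reversed(nvec):
        index, a = divmod(index, n)
        action.append(a)
    return list(reversed(action))

# Example: two dimensions of sizes 3 and 4 -> 3 * 4 = 12 combined actions
nvec = [3, 4]
idx = flatten_action([2, 1], nvec)        # 2 * 4 + 1 = 9
assert unflatten_action(idx, nvec) == [2, 1]
```

An environment wrapper would then expose a `Discrete(prod(nvec))` space and call `unflatten_action` inside `step()` before passing the action to the wrapped environment.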


  • EDIT: The above applies to DQN. MultiDiscrete action spaces are of course supported by other algorithms.

@arturn thank you for your answer. Yes, that seems to be a nice way of fixing it; however, the action space is huge, MultiDiscrete([8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8]), and I doubt that this will be possible. Do you think it will work?
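For scale: flattening sixteen dimensions of eight choices each yields 8^16 combined discrete actions, which a single DQN Q-value head would have to enumerate:

```python
# 16 independent dimensions, each with 8 choices
n_combined = 8 ** 16
print(n_combined)  # 281474976710656, roughly 2.8e14 combined actions
```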

I think you should consider switching to another algorithm. Without ever having used DQN on such a large action space, I presume that DQN will fail here. Maybe someone else has more experience with getting your problem to work. Have you considered a policy-gradient algorithm? Cheers
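For example, PPO in RLlib handles MultiDiscrete action spaces directly. A rough sketch, assuming a recent Ray 2.x release (`"your_env"` is a placeholder for your registered environment name, and the builder-method names have shifted between RLlib versions, so check the docs for your installed version):

```python
# Hypothetical sketch: train PPO on an environment whose action space is
# MultiDiscrete([8] * 16); PPO's policy head factorizes over the dimensions.
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("your_env")  # placeholder: your registered env name
)
algo = config.build()
result = algo.train()
```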