[rllib] Dict Action Space and Custom Model

sven1977 · March 30, 2021, 3:03pm

When using dict action spaces, your model should output a flat tensor, which will then be passed into a MultiActionDistribution for action sampling. This sampling step then returns a dict.
The alphabetic sorting is potentially a problem. However, it’s forced upon RLlib by gym’s very own Dict space handling (Dict.spaces is an OrderedDict).

If you check the code in MultiActionDistribution (ray/torch_action_dist.py on the master branch of ray-project/ray on GitHub), you will see that we create an alphabetically sorted action_space_struct dict, which we then use to regenerate the action dict from your flat tensor outputs.

In other words, as long as your model returns a tensor whose components are ordered alphabetically by the keys of your dict, it’ll be fine. If you have additional nesting going on, print out self.action_space_struct in the MultiActionDistribution to see what the exact order should be.
Alternatively, you can use a custom action distribution, which then would handle your model’s output (whatever that would be, e.g. a dict), but then you would be responsible for the “handover” between model and action distribution.
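
To make the alphabetical ordering concrete, here is a minimal sketch in plain Python. It is not RLlib's actual code: the keys "move" and "aim", the output sizes, and both helper functions are hypothetical, and nested dicts are not handled. It only illustrates concatenating per-key model outputs in sorted key order and splitting the flat result back into a dict the same way.

```python
from typing import Dict, List


def flatten_in_key_order(outputs: Dict[str, List[float]]) -> List[float]:
    """Concatenate per-key outputs in alphabetical key order (hypothetical helper)."""
    flat: List[float] = []
    for key in sorted(outputs):
        flat.extend(outputs[key])
    return flat


def split_in_key_order(flat: List[float], sizes: Dict[str, int]) -> Dict[str, List[float]]:
    """Cut a flat output back into per-key chunks, again in alphabetical key order."""
    parts: Dict[str, List[float]] = {}
    start = 0
    for key in sorted(sizes):
        end = start + sizes[key]
        parts[key] = flat[start:end]
        start = end
    return parts


# Hypothetical outputs of two model heads, written in non-alphabetical order.
head_outputs = {
    "move": [0.1, 0.2, 0.7],        # e.g. 3 logits for a discrete component
    "aim": [0.5, -0.5, 0.0, 0.0],   # e.g. 2 means + 2 log-stds for a continuous one
}

# "aim" sorts before "move", so its values come first in the flat output.
flat = flatten_in_key_order(head_outputs)

sizes = {key: len(values) for key, values in head_outputs.items()}
restored = split_in_key_order(flat, sizes)
```

In a real model the lists would be tensors and the concatenation would happen along the last dimension, but the ordering rule is the same: whatever order you build the heads in, emit them sorted by key.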
