Examples of using multiple (simultaneous) actions?

I have an actor-critic network that has to choose multiple actions simultaneously. The actions are discrete (specific values, not continuous), so I group them with gym.spaces.MultiDiscrete.
I see that RLlib has MultiActionDistribution/TorchMultiActionDistribution and MultiCategorical classes (from ray.rllib.models.torch.torch_action_dist), but I could not find any DQN or actor-critic examples that use them.

To be more precise: what should the output of the actor network be in a multi-action problem? Usually the actor network outputs one logit per action and a single action is chosen. In a multi-action problem, I suppose I would have to enumerate every possible combination of sub-actions and have the actor output one logit per combination (for example, MultiDiscrete([4, 3, 5]) gives 4 * 3 * 5 = 60 outputs), but in my case the number of combinations is in the hundreds. Can RLlib handle such multi-action problems more easily using the classes mentioned above?
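To illustrate the alternative I am hoping for: a factorized head that outputs 4 + 3 + 5 = 12 logits (one categorical per sub-action) instead of 60 joint ones. This is roughly what a MultiCategorical-style distribution does conceptually. Below is only a plain PyTorch sketch; the class and names (e.g. FactorizedActor) are made up for illustration and are not RLlib APIs.

```python
import torch
import torch.nn as nn

class FactorizedActor(nn.Module):
    """Hypothetical sketch: one categorical head per sub-action of MultiDiscrete([4, 3, 5])."""

    def __init__(self, obs_dim, nvec=(4, 3, 5)):
        super().__init__()
        self.nvec = nvec
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        # Single linear layer producing 4 + 3 + 5 = 12 logits; split per sub-action below.
        self.logits = nn.Linear(64, sum(nvec))

    def forward(self, obs):
        features = self.body(obs)
        flat_logits = self.logits(features)
        # Split into one logit vector per sub-action and sample each independently.
        split = torch.split(flat_logits, list(self.nvec), dim=-1)
        dists = [torch.distributions.Categorical(logits=l) for l in split]
        actions = torch.stack([d.sample() for d in dists], dim=-1)  # shape [batch, 3]
        # Joint log-prob is the sum of the per-sub-action log-probs.
        log_prob = torch.stack(
            [d.log_prob(a) for d, a in zip(dists, actions.unbind(-1))], dim=-1
        ).sum(-1)
        return actions, log_prob

actor = FactorizedActor(obs_dim=8)
acts, logp = actor(torch.randn(2, 8))
print(acts.shape, logp.shape)  # torch.Size([2, 3]) torch.Size([2])
```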

For PG-type algorithms ((A)PPO, A3C, PG), this should not be a problem and you can use a MultiDiscrete action space, even without having to “flatten” it into a Discrete one.

You can check our test case for this: ray/rllib/tests/test_supported_spaces.py::TestSupportedSpacesPG
where we test all the above algos for different action spaces. You can even use Dict/Tuple action spaces with these.
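For example, here is a minimal sketch of wiring a MultiDiscrete action space into PPO. It assumes the older ray.rllib.agents.ppo.PPOTrainer API and the classic gym.Env interface; exact imports and config keys vary across RLlib versions, and the toy environment and reward below are made up purely for illustration:

```python
import gym
import numpy as np
from gym.spaces import Box, MultiDiscrete
import ray
from ray.rllib.agents.ppo import PPOTrainer  # newer versions: ray.rllib.algorithms.ppo

class MultiActionEnv(gym.Env):
    """Toy env: three simultaneous discrete sub-actions per step."""

    def __init__(self, config=None):
        self.observation_space = Box(low=-1.0, high=1.0, shape=(8,), dtype=np.float32)
        self.action_space = MultiDiscrete([4, 3, 5])
        self._steps = 0

    def reset(self):
        self._steps = 0
        return self.observation_space.sample()

    def step(self, action):
        # `action` arrives as an array like [a0, a1, a2], one entry per sub-space.
        self._steps += 1
        reward = float(np.sum(action))  # placeholder reward for the sketch
        done = self._steps >= 20
        return self.observation_space.sample(), reward, done, {}

ray.init()
trainer = PPOTrainer(env=MultiActionEnv, config={"framework": "torch", "num_workers": 0})
print(trainer.train()["episode_reward_mean"])
```

With a MultiDiscrete space like this, the policy head emits one logit vector per sub-action (12 logits total here) rather than one output per joint combination (60).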
