How to solve env with very large action space

Peter_Pirog · April 5, 2022, 8:00pm

I would like to solve an env with a relatively large action space. The action is a position in a 2D np.array of size (1000, 200), which gives 200000 possible actions (outputs of the neural net).
I think that if I could somehow define the action as a row number and a column number, the neural network would need only 1200 output values. Is it possible to do this?
I analyzed the example parametric_actions_cartpole.py, but I'm not sure how to do it. Is it somehow connected with the list of action embeddings?
I understand the idea of observation encoding, but the idea of action encoding and how to use it is unclear to me.
So is there any example of how to define and use actions with two integers as output (column number and row number)?

arturn · April 13, 2022, 1:50pm

Hi @Peter_Pirog ,

If you want your actions to be selected from the same space, i.e. 0…1200, this won’t make your space smaller if you select two actions on each timestep.
Here are two ideas:

Sometimes large discrete action spaces are not “as discrete” as they sound. Meaning: if adjacent actions are very similar rather than completely different decisions, you can try modelling the space as continuous and simply round your action outputs to discrete numbers.
You can still try this with one of the fancier algorithms. Plain DQN will likely not help you here, but you can give Rainbow a shot.
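
As a rough illustration of the first idea, here is a minimal sketch of mapping a continuous action to a grid cell. It assumes the policy outputs two floats in [-1, 1] (for example from a Box space of shape (2,)) and that the grid is the (1000, 200) array from the question. The function name and the equal-width binning are assumptions made for illustration, not something taken from RLlib.

```python
def continuous_to_cell(action, n_rows=1000, n_cols=200):
    """Map a continuous action in [-1, 1]^2 to an integer (row, col) cell.

    Hypothetical helper: `action` is any 2-element sequence of floats.
    """
    def to_index(x, n):
        # Clip to the valid range in case the policy overshoots.
        x = max(-1.0, min(1.0, float(x)))
        # Scale [-1, 1] to [0, n) and truncate to an integer bin.
        # The outer min() keeps x == 1.0 inside the last bin.
        return min(int((x + 1.0) / 2.0 * n), n - 1)

    return to_index(action[0], n_rows), to_index(action[1], n_cols)


if __name__ == "__main__":
    print(continuous_to_cell((0.0, 0.0)))
    print(continuous_to_cell((1.0, -1.0)))
```

This only makes sense if neighbouring cells really are similar decisions, as described above. Otherwise the policy has no reason to treat nearby outputs as related.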

Cheers

arturn · April 13, 2022, 1:54pm

Any action space that is multi discrete relates to what you are describing here. The two_step_example_game features this.
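
To make the row/column idea concrete, here is a small sketch in plain Python of how a factored action relates to the flat 200000-way index. In Gym, a space of this shape would be spaces.MultiDiscrete([1000, 200]). The helper names below are hypothetical, and the greedy argmax per head is an assumption made only to keep the example short.

```python
N_ROWS, N_COLS = 1000, 200  # grid size from the original question


def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=values.__getitem__)


def decode_action(row_logits, col_logits):
    """Pick a (row, col) pair from two separate output heads.

    The network needs N_ROWS + N_COLS = 1200 outputs
    instead of N_ROWS * N_COLS = 200000.
    """
    return argmax(row_logits), argmax(col_logits)


def to_flat(row, col):
    """Convert (row, col) to the equivalent flat action index."""
    return row * N_COLS + col


def from_flat(index):
    """Convert a flat action index back to (row, col)."""
    return divmod(index, N_COLS)


if __name__ == "__main__":
    row_logits = [0.0] * N_ROWS
    col_logits = [0.0] * N_COLS
    row_logits[42] = 1.0
    col_logits[7] = 2.0
    row, col = decode_action(row_logits, col_logits)
    print(row, col, to_flat(row, col))
```

Note that two independent heads choose the row and the column separately, so the number of joint actions is unchanged. Only the number of network outputs shrinks.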

Peter_Pirog · April 13, 2022, 10:01pm

@arturn Thank you for the answer, I will check it.

evo11x · April 13, 2022, 10:04pm

With only 50 discrete actions, RLlib was 20 to 50 times slower (on Windows 11) than with a Box action space, so I ended up defining the actions as a Box and rounding the numbers.

evo11x · April 13, 2022, 10:09pm

Sorry, it was MultiDiscrete (50 actions each with values from 1 to 20)

arturn · April 14, 2022, 9:39am

Hi @evo11x,

Can you elaborate on this?
Maybe provide a script and the exact change you made that led to a slowdown of 20x-50x?

Thanks

evo11x · April 14, 2022, 9:35pm

Just add an action space like this to any environment and you should see the slowdown

self.action_space = spaces.MultiDiscrete(np.full((50), 50))

compared to this
self.action_space = spaces.Box(low=-1, high=1, shape=(50,), dtype=np.float32)

The slowdown was huge on Windows 11; on Ubuntu it was slightly faster, but not by much.
