Hello community,
I am wondering if there is a way to add a condition to the action space.
For context, my agent has 4 possible actions. I want to represent them in the action space using 4 binary channels, like this:
self.action_space = gymnasium.spaces.Dict(
    {f"agent_{i+1}": gymnasium.spaces.Dict({
        "up": gymnasium.spaces.Discrete(2),
        "down": gymnasium.spaces.Discrete(2),
        "left": gymnasium.spaces.Discrete(2),
        "right": gymnasium.spaces.Discrete(2)
    }) for i in range(nb_agents)}
)
Is there a way, or perhaps a different choice of spaces, to make sampling select one and only one action for each agent? I have already tried representing actions with numbers, but this is not what I want:
self.action_space = gymnasium.spaces.Dict(
    {f"agent_{i+1}": gymnasium.spaces.Discrete(NB_ACTIONS) for i in range(nb_agents)}
)
Thank you,
Cheers
Hi @Douae_Ahmadoun ,
from your code I can only guess that you want to define a multi-agent environment. If all your agents have the same action_space, you can define it like:
self.action_space = gymnasium.spaces.Discrete(4)
All agents then have the same action space and can choose between 4 discrete actions: up, down, left, and right.
Take a look at the documentation. See also the example here for a simple multi-agent environment.
Thank you for your reply, @Lars_Simon_Zehnder.
This is more or less what I have already done and want to change.
This is because I don’t want my actions to be represented by numbers (e.g., 1 for up, 2 for down, …). Instead, I want masks, one for each possible action, with a value of 0 or 1, and with exactly one action set to 1 at a time.
@Douae_Ahmadoun ,
I doubt that this is possible with the current gymnasium spaces. The usual way to give masks is by assigning the values from, e.g., a Discrete space to directions, as shown here for the FrozenLake environment.
If you want to have several actions that can be chosen simultaneously, you could use a MultiDiscrete space like
action_space = gymnasium.spaces.MultiDiscrete([2, 2, 2, 2])
action_space.sample()
# e.g. array([0, 1, 0, 1]); samples are random and may set more than one entry to 1
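One possible workaround for the "exactly one action set to 1" requirement is to keep sampling an action index (as with the Discrete space above) and convert it to binary flags afterwards. The following is only a minimal sketch in plain Python, not a gymnasium feature: the helper names to_one_hot, from_one_hot and sample_one_hot, and the ordering of ACTIONS, are assumptions made for illustration.

```python
import random

# Assumed ordering of the four actions; index 0 = "up", ..., index 3 = "right".
ACTIONS = ("up", "down", "left", "right")


def to_one_hot(index):
    """Turn an action index (0..3) into a dict of binary flags with exactly one 1."""
    if not 0 <= index < len(ACTIONS):
        raise ValueError(f"action index out of range: {index}")
    return {name: int(i == index) for i, name in enumerate(ACTIONS)}


def from_one_hot(flags):
    """Recover the action index, rejecting anything that is not one-hot."""
    values = [int(flags[name]) for name in ACTIONS]
    if sorted(values) != [0] * (len(ACTIONS) - 1) + [1]:
        raise ValueError(f"not a one-hot action: {flags}")
    return values.index(1)


def sample_one_hot(nb_agents, rng=random):
    """Sample one one-hot action per agent, keyed like the Dict space in the question."""
    return {
        f"agent_{i + 1}": to_one_hot(rng.randrange(len(ACTIONS)))
        for i in range(nb_agents)
    }


if __name__ == "__main__":
    print(to_one_hot(2))
    print(sample_one_hot(3))
```

With a gymnasium Discrete(4) space, something like to_one_hot(int(action_space.sample())) should give the same flag form, so the space itself stays Discrete while the rest of the code works with the masks. from_one_hot can be used to validate incoming actions, since a MultiDiscrete or nested Dict sample is not guaranteed to contain a single 1.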