Condition on action space

Hello community,

I am wondering whether there is a way to add a condition to the action space.
For context, my agent has 4 possible actions. I want to represent them in the action space using 4 binary channels, like this:

self.action_space = gymnasium.spaces.Dict(
    {
        f"agent_{i+1}": gymnasium.spaces.Dict({
            "up": gymnasium.spaces.Discrete(2),
            "down": gymnasium.spaces.Discrete(2),
            "left": gymnasium.spaces.Discrete(2),
            "right": gymnasium.spaces.Discrete(2),
        })
        for i in range(nb_agents)
    }
)

Is there a way, or perhaps a different choice of spaces, to make sampling select one and only one action for each agent? I have already tried representing actions with numbers, but this is not what I want:

self.action_space = gymnasium.spaces.Dict(
    {f"agent_{i+1}": gymnasium.spaces.Discrete(NB_ACTIONS) for i in range(nb_agents)}
)

Thank you,
Cheers

Hi @Douae_Ahmadoun ,

From your code I can only guess that you want to define a multi-agent environment. If all your agents have the same action_space, you can define it like this:

self.action_space = gymnasium.spaces.Discrete(4)

All agents then have the same action space and can choose between 4 discrete actions: up, down, left, and right.

Take a look at the documentation, and see also the example of a simple multi-agent environment.

Thank you for your reply, @Lars_Simon_Zehnder.

This is more or less what I have already done and want to change.
I don’t want my actions to be represented by numbers (e.g., 1 for up, 2 for down, …). Instead, I want masks, one for each possible action, each with the value 0 or 1, with exactly one of them set to 1 at a time.

@Douae_Ahmadoun ,

I doubt that this is possible with the current gymnasium spaces. The usual approach is to map the values of, e.g., a Discrete space to directions, as is done for the FrozenLake environment.
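
One possible way to reconcile the two representations, sketched here in plain Python rather than as a gymnasium feature: keep the Discrete index as the sampled action and convert it to a one-hot dictionary inside the environment. The helper names to_one_hot, from_one_hot and sample_one_hot_actions are hypothetical, and random.Random.randrange only stands in for sampling from a Discrete(4) space.

```python
import random

# Assumed ordering of the four actions; the index-to-name mapping is a free choice.
ACTION_NAMES = ("up", "down", "left", "right")


def to_one_hot(action_index, names=ACTION_NAMES):
    """Turn a discrete action index into a dict with exactly one entry set to 1."""
    if not 0 <= action_index < len(names):
        raise ValueError(f"action index {action_index} is out of range")
    return {name: int(i == action_index) for i, name in enumerate(names)}


def from_one_hot(mask, names=ACTION_NAMES):
    """Recover the discrete index, rejecting dicts that are not one-hot."""
    active = [i for i, name in enumerate(names) if mask[name] == 1]
    if len(active) != 1 or any(mask[name] not in (0, 1) for name in names):
        raise ValueError("exactly one action must be set to 1")
    return active[0]


def sample_one_hot_actions(nb_agents, rng=None):
    """Sample one action per agent (stand-in for Discrete(4).sample()) as one-hot dicts."""
    if rng is None:
        rng = random.Random()
    return {
        f"agent_{i+1}": to_one_hot(rng.randrange(len(ACTION_NAMES)))
        for i in range(nb_agents)
    }


print(to_one_hot(2))  # {'up': 0, 'down': 0, 'left': 1, 'right': 0}
```

With this kind of conversion the constraint holds by construction, because the one-hot dictionary is always derived from a single index.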

If you want several actions that can be chosen simultaneously, you could use a MultiDiscrete() space like this:

action_space = gymnasium.spaces.MultiDiscrete([2, 2, 2, 2])
action_space.sample()
# e.g. array([0, 1, 0, 1])
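
Note that the four entries of such a sample are drawn independently, so nothing forces exactly one of them to be 1. A small plain-Python check (no gymnasium needed; is_one_hot is a hypothetical helper) makes the proportion concrete by enumerating all binary vectors of length 4:

```python
from itertools import product


def is_one_hot(vector):
    """Return True if every entry is 0 or 1 and exactly one entry is 1."""
    return all(v in (0, 1) for v in vector) and sum(vector) == 1


# Every value a 4-channel binary action vector can take.
all_vectors = list(product((0, 1), repeat=4))
valid = [v for v in all_vectors if is_one_hot(v)]

print(len(all_vectors))  # 16
print(len(valid))        # 4
```

So, assuming the channels are sampled uniformly and independently, only about a quarter of the samples would satisfy the one-action constraint; the rest would have to be rejected or repaired by the environment.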

I see, thank you @Lars_Simon_Zehnder.