How to avoid flattening the action mask with a Dict observation

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes significant difficulty to completing my task, but I can work around it.

I want to perform action masking in my multi-agent environment using DQN with a custom model.
My multi-agent environment produces observations as follows:

observation = {
    agent_1_key: {
        'observation': {...},           # Dictionary with observations for agent 1
        'action_mask': [1, 1, 0, ...],  # Action mask for agent 1
    },
    agent_2_key: {
        'observation': {...},           # Dictionary with observations for agent 2
        'action_mask': [1, 1, 0, ...],  # Action mask for agent 2
    },
    ...  # One entry per agent
}
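
For reference, the per-agent observation space that produces this structure would look roughly like the following sketch (the inner field names and sizes are placeholders, not my actual spaces):

import numpy as np
from gymnasium import spaces  # `from gym import spaces` on older Ray versions

NUM_ACTIONS = 5  # placeholder

# Each agent's space nests the real observation under 'observation'
# and keeps the mask alongside it under 'action_mask'.
per_agent_obs_space = spaces.Dict({
    'observation': spaces.Dict({
        'position': spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32),
    }),
    'action_mask': spaces.Box(low=0.0, high=1.0, shape=(NUM_ACTIONS,),
                              dtype=np.float32),
})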

My understanding is that, in a custom model, the input_dict passed to the forward() method contains 'obs' (the original, structured observation) and 'obs_flat' (the flattened observation tensor).

However, if I use 'obs_flat' as the input to my model, it will also contain the flattened "action_mask" field, which I don't want. On the other hand, I cannot feed "obs" to the network directly, since it is a dictionary.

Do I need to modify the "obs_flat" tensor, flatten the "obs" dictionary myself with the "action_mask" fields removed, or is there another way to do this?
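
For context, this is the kind of forward() I am trying to write. It is only a sketch: it ignores 'obs_flat', manually flattens the inner 'observation' sub-dict, and applies the mask to the model outputs. The layer sizes are placeholders, and I am assuming the usual log-mask trick is acceptable for DQN's outputs:

import torch
import torch.nn as nn
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2
from ray.rllib.utils.torch_utils import FLOAT_MIN  # ray.rllib.utils.torch_ops on older Ray

class MaskedDQNModel(TorchModelV2, nn.Module):
    """Sketch: separate the action mask from the actual observation."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                              model_config, name)
        nn.Module.__init__(self)
        # 16 is a placeholder for the flattened size of the inner 'observation'.
        self.net = nn.Sequential(
            nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, num_outputs))

    def forward(self, input_dict, state, seq_lens):
        obs = input_dict["obs"]  # nested dict: {'observation': {...}, 'action_mask': ...}
        action_mask = obs["action_mask"]
        # Flatten only the inner 'observation' sub-dict, skipping the mask.
        flat = torch.cat(
            [v.flatten(start_dim=1).float() for v in obs["observation"].values()],
            dim=1)
        logits = self.net(flat)
        # Push masked-out actions to a very large negative value.
        inf_mask = torch.clamp(torch.log(action_mask), min=FLOAT_MIN)
        return logits + inf_mask, state

Is manually flattening like this the intended approach, or does RLlib provide something built in for it?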

Thank you in advance
