I have a multi-agent environment with:
- Observation space: a nested dict space like {agent_id: {'obs': observation, 'state': env_state}}
- Action space: a dict action space with discrete actions like {agent_id: action}
- Reward and done: dicts {agent_id: reward} and {agent_id: done}
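To make the layout concrete, here is a minimal sketch of what a single step could return in this format. The agent IDs `agent_0` and `agent_1`, the placeholder values, and the helper `make_step_output` are hypothetical and only for illustration; they are not my actual environment.

```python
# Hypothetical two-agent example of the per-agent dict layout described above.
AGENT_IDS = ["agent_0", "agent_1"]  # assumed IDs, for illustration only


def make_step_output(env_state):
    """Build obs/reward/done dicts keyed by agent ID (placeholder values)."""
    obs = {
        agent_id: {"obs": [float(i)], "state": env_state}
        for i, agent_id in enumerate(AGENT_IDS)
    }
    rewards = {agent_id: 0.0 for agent_id in AGENT_IDS}
    dones = {agent_id: False for agent_id in AGENT_IDS}
    return obs, rewards, dones


obs, rewards, dones = make_step_output(env_state=[0.0, 1.0])
actions = {agent_id: 0 for agent_id in AGENT_IDS}  # one discrete action per agent
```

Every dict uses the same agent IDs as keys, and each agent's observation carries both its local 'obs' and the shared 'state'.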
I want to use QMIX, so in my env creator function I do the following:
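from gym.spaces import Dict, Tuple  # or gymnasium.spaces, depending on the RLlib version

def env_creator(config):
    env = MyEnv(**config)
    env = env.with_agent_groups(
        groups={'group_1': env.agent_ids},  # All agents in the same group
        # One Dict per agent, in the same order as the group's agent list
        observation_space=Tuple([
            Dict({
                'obs': env.observation_space[agent_id]['obs'],
                'state': env.observation_space[agent_id]['state'],
            })
            for agent_id in env.agent_ids
        ]),
        # One discrete action space per agent, in the same order
        action_space=Tuple([
            env.action_space[agent_id] for agent_id in env.agent_ids
        ]),
    )
    return env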
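My understanding of what the grouping is supposed to do, written as a plain-Python sketch: the per-agent observations are packed into one ordered list under the group ID, and the group's action is unpacked back into a per-agent dict. The functions `group_items` and `ungroup_actions` and the agent IDs are hypothetical illustrations under that assumption, not RLlib's implementation.

```python
def group_items(per_agent, groups):
    """Pack per-agent values into one ordered list per group."""
    return {
        group_id: [per_agent[agent_id] for agent_id in agent_ids]
        for group_id, agent_ids in groups.items()
    }


def ungroup_actions(grouped_actions, groups):
    """Unpack one action list per group into a dict keyed by agent ID."""
    per_agent = {}
    for group_id, actions in grouped_actions.items():
        for agent_id, action in zip(groups[group_id], actions):
            per_agent[agent_id] = action
    return per_agent


groups = {"group_1": ["agent_0", "agent_1"]}  # assumed agent IDs
obs = {"agent_0": {"obs": [0.0]}, "agent_1": {"obs": [1.0]}}

grouped_obs = group_items(obs, groups)              # keyed by group ID
actions = ungroup_actions({"group_1": [1, 0]}, groups)  # keyed by agent ID
```

If this picture is right, I would expect a non-empty action dict with one entry per agent after ungrouping.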
- If I run this, QMIX returns empty actions, i.e. action = {}.
- Is the above correct, both the env observation and action spaces and the env creator function?