Multi-agent configuration incompatible with Ray hyperparam tuning

@sven1977 @fedetask
In the case of PettingZoo envs, it seems impossible to use different spaces for different agents when the obs/action spaces are inferred from the env.
PettingZoo env wrapper code on master: https://github.com/ray-project/ray/blob/master/rllib/env/wrappers/pettingzoo_env.py

        # Get first observation space, assuming all agents have equal space
        self.observation_space = self.par_env.observation_space(self.par_env.agents[0])

        # Get first action space, assuming all agents have equal space
        self.action_space = self.par_env.action_space(self.par_env.agents[0])

        assert all(
            self.par_env.observation_space(agent) == self.observation_space
            for agent in self.par_env.agents
        ), (
            "Observation spaces for all agents must be identical. Perhaps "
            "SuperSuit's pad_observations wrapper can help (useage: "
            "`supersuit.aec_wrappers.pad_observations(env)`"
        )
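
Until the inference is fixed, a possible workaround is to skip it entirely and declare each agent's spaces per policy in the `multiagent` config via `PolicySpec`. A minimal sketch, assuming a parallel PettingZoo env; the helper names `per_agent_policies` and `make_multiagent_config` are mine, not RLlib's. Note that the stock wrapper above would still trip its equal-spaces assertion, so this implies wrapping the env in your own `MultiAgentEnv` subclass:

    # Sketch of a workaround: declare each agent's spaces explicitly so
    # RLlib never has to infer a single shared space from the env.
    from ray.rllib.policy.policy import PolicySpec

    def per_agent_policies(par_env):
        """Build one PolicySpec per agent, each with that agent's own spaces.

        `par_env` is assumed to be any PettingZoo ParallelEnv instance.
        """
        return {
            agent_id: PolicySpec(
                observation_space=par_env.observation_space(agent_id),
                action_space=par_env.action_space(agent_id),
            )
            for agent_id in par_env.possible_agents
        }

    def make_multiagent_config(par_env):
        return {
            "multiagent": {
                "policies": per_agent_policies(par_env),
                # Route each agent to the policy named after it.
                "policy_mapping_fn": lambda agent_id, *args, **kwargs: agent_id,
            },
        }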

And also, as of today, the MultiAgentDict action/obs space inference from MultiAgentEnv does not work as it should when different agents have different spaces. That's why PettingZooEnv uses the hacky approach above of retrieving the spaces from the first agent only and using them for all agents.
But it should be possible to define a MultiAgentDict of spaces; a lot of the groundwork has already been done in multi_agent_env.py. A sketch of what that could look like follows below.
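
To make that concrete, here is a minimal sketch of an env exposing its spaces as `gym.spaces.Dict` objects keyed by agent id. The agent ids and space shapes are made up for illustration, and whether RLlib's inference accepts this layout is exactly what the open work is about:

    # Minimal sketch: a MultiAgentEnv exposing a per-agent dict of spaces
    # instead of one shared space. Agent ids and shapes are illustrative.
    import gym
    import numpy as np
    from ray.rllib.env.multi_agent_env import MultiAgentEnv

    class TwoSpacesEnv(MultiAgentEnv):
        def __init__(self, config=None):
            super().__init__()
            self._agent_ids = {"agent_0", "agent_1"}
            # A MultiAgentDict of spaces: one entry per agent.
            self.observation_space = gym.spaces.Dict({
                "agent_0": gym.spaces.Box(-1.0, 1.0, (4,), np.float32),
                "agent_1": gym.spaces.Box(-1.0, 1.0, (8,), np.float32),
            })
            self.action_space = gym.spaces.Dict({
                "agent_0": gym.spaces.Discrete(2),
                "agent_1": gym.spaces.Discrete(5),
            })

        def reset(self):
            return {aid: self.observation_space[aid].sample()
                    for aid in self._agent_ids}

        def step(self, action_dict):
            # Dummy dynamics: random obs, zero reward, episode ends at once.
            obs = {aid: self.observation_space[aid].sample()
                   for aid in action_dict}
            rew = {aid: 0.0 for aid in action_dict}
            done = {aid: True for aid in action_dict}
            done["__all__"] = True
            return obs, rew, done, {}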

Please have a look at my post; I just updated it: https://discuss.ray.io/t/multiagents-type-actions-observation-space-defined-in-environement/5120/5

I'll keep an eye on this post too; I have a similar use case.

Edit: This PR makes multi-agent env space inference work without having to provide the spaces via PolicySpec in the config: [RLlib] Discussion 6060 and 5120: auto-infer different agents' spaces in multi-agent env. by sven1977 · Pull Request #24649 · ray-project/ray · GitHub