RLlib Multi-Agent/ReplayBuffer DQN/SAC Error: Agents with Different Observation Space Shapes

Hi everyone!
I hope you’re all having a great day. I’m currently working on a project using RLlib for Multi-Agent RL and I’ve encountered a problem that I could use some help with. I’ll try to provide as much detail as possible, so please bear with me!

I’m working with multiple agents, each having a different observation space. Everything seems to be working fine when I use on-policy algorithms like PPO and A3C. However, when I switch to off-policy algorithms like SAC and DQN, I’m getting an error that I’m not sure how to fix. The error message is:

"AssertionError: built_steps (1) + ongoing_steps (1) != rollout_fragment_length (1)."

To give you more context, here’s the config I’m using to run my experiment:

```python
batch_size = 1024

config = (
    DQNConfig()
    .training(
        gamma=0.99,
        lr=0.00005,
        dueling=True,
        double_q=True,
        before_learn_on_batch=True,
        train_batch_size=batch_size,
        replay_buffer_config={
            "_enable_replay_buffer_api": True,
            "type": "MultiAgentReplayBuffer",
            "capacity": 50000,
            "replay_sequence_length": 1,
        },
    )
    .multi_agent(
        policies={
            # observation/action spaces trimmed from this snippet --
            # each agent has its own shapes
            "agent_one": (...),
            "agent_two": (...),
            "agent_three": (...),
        },
    )
    .resources(num_gpus=int(os.environ.get("RLLIB_NUM_GPUS", "0")))
)
```

As you can see, I’ve set up three agents with their own observation and action spaces in the multi_agent configuration.

My main question is: What does this error message mean, and how can I resolve it?
I’m also curious whether the replay buffer in RLlib can handle multiple agents with different observation space shapes.

I would really appreciate any insights or suggestions you can offer.
Your help would mean a lot to me, and I’m eager to learn from your experiences.

Thank you so much for your time and consideration! :blush:

Thank you for raising this! RLlib ReplayBuffers should work with a multi-agent environment with different observation spaces. Could you provide a repro script with your environment and policy_mapping_fn?
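For the repro, the mapping function can be very small. Here is a minimal sketch of the kind of `policy_mapping_fn` that keeps the spaces separated, assuming agent IDs like `"agent_one"` correspond 1:1 to policy IDs of the same name (the names are taken from your config snippet):

```python
# Hypothetical 1:1 mapping: each agent id ("agent_one", "agent_two", ...)
# is routed to a policy of the same name, so every policy trains on --
# and stores in its replay buffer -- only samples from its own
# observation/action spaces.
def policy_mapping_fn(agent_id, episode=None, worker=None, **kwargs):
    return agent_id
```

With a 1:1 mapping like this, each policy's buffer only ever sees one observation shape, which is the setup the multi-agent replay buffer is meant to handle.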

I tested the `multi_agent_different_spaces_for_agents.py` example from the Ray repo (ray-project/ray on GitHub, master branch) with PPO, DQN, and SAC, and could not reproduce the issue.