To use the old API stack, set .api_stack(enable_rl_module_and_learner=False, enable_env_runner_and_connector_v2=False) in your config. For multi-agent DQN with a replay buffer on the old stack, also set "type": "MultiAgentPrioritizedReplayBuffer" in replay_buffer_config (not "MultiAgentReplayBuffer"), as the old stack does not support the new episode-based buffer. Example:
from ray.rllib.algorithms.dqn import DQNConfig

# env_config, EPS_SCHEDULE, SEED, and MetricsLoggerCallback are defined
# elsewhere in your script.
config = (
    DQNConfig()
    .framework("torch")
    .environment("sumo_marl", env_config=env_config)
    # Switch to the old API stack.
    .api_stack(
        enable_rl_module_and_learner=False,
        enable_env_runner_and_connector_v2=False,
    )
    .env_runners(
        num_env_runners=1,
        num_envs_per_env_runner=1,
        num_cpus_per_env_runner=3,
        sample_timeout_s=50000,
    )
    # All agents map to a single shared policy.
    .multi_agent(
        policies=["shared"],
        policy_mapping_fn=lambda agent_id, *a, **kw: "shared",
    )
    .learners(num_learners=0, num_cpus_per_learner=3)
    .training(
        gamma=0.99,
        lr=1e-4,
        num_steps_sampled_before_learning_starts=20_000,
        train_batch_size=4096,
        # Prioritized multi-agent buffer supported on the old stack.
        replay_buffer_config={
            "type": "MultiAgentPrioritizedReplayBuffer",
            "capacity": 300_000,
        },
        target_network_update_freq=8000,
        double_q=True,
        dueling=True,
        n_step=1,
        epsilon=EPS_SCHEDULE,
    )
    .callbacks(MetricsLoggerCallback)
    .debugging(seed=SEED)
)
This avoids the new stack’s episode buffer and uses the supported prioritized buffer for multi-agent DQN.
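For completeness, here is a minimal sketch of how the config above could be built and trained. It assumes the environment is registered under the name "sumo_marl" used in .environment(); SumoMultiAgentEnv is a hypothetical placeholder for your own MultiAgentEnv subclass, and the exact result key for episode reward can vary between RLlib versions.

import ray
from ray.tune.registry import register_env

# Register the multi-agent SUMO env under the name used in .environment().
# SumoMultiAgentEnv is a placeholder for your own MultiAgentEnv subclass.
register_env("sumo_marl", lambda cfg: SumoMultiAgentEnv(cfg))

ray.init()

# `config` is the DQNConfig defined above.
algo = config.build()

for i in range(200):
    result = algo.train()
    # Depending on the RLlib version, episode reward may live at the top
    # level of the result dict or under a sub-dict such as "env_runners".
    print(f"iter {i}: reward_mean={result.get('episode_reward_mean')}")

algo.stop()
ray.shutdown()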
Would you like a step-by-step explanation or more details?
Sources:
- Multi-agent replay buffer in DQN fails to run
- replay_buffer_api.py example
- New API stack migration guide