How severely does this issue affect your experience of using Ray?
- Medium: It contributes significant difficulty to completing my task, but I can work around it.
Hi all, I am currently a bit confused about the new API stack involving RLModules and Policies, especially in a multi-agent setting. Here is the example shown in the documentation:
import gymnasium as gym
from ray.rllib.core.rl_module.rl_module import SingleAgentRLModuleSpec
from ray.rllib.core.rl_module.marl_module import MultiAgentRLModuleSpec

spec = MultiAgentRLModuleSpec(
    # BCTorchMultiAgentModuleWithSharedEncoder is defined earlier in the
    # same documentation example.
    marl_module_class=BCTorchMultiAgentModuleWithSharedEncoder,
    module_specs={
        "local_2d": SingleAgentRLModuleSpec(
            observation_space=gym.spaces.Dict(
                {
                    "global": gym.spaces.Box(low=-1, high=1, shape=(2,)),
                    "local": gym.spaces.Box(low=-1, high=1, shape=(2,)),
                }
            ),
            action_space=gym.spaces.Discrete(2),
            model_config_dict={"fcnet_hiddens": [64]},
        ),
        "local_5d": SingleAgentRLModuleSpec(
            observation_space=gym.spaces.Dict(
                {
                    "global": gym.spaces.Box(low=-1, high=1, shape=(2,)),
                    "local": gym.spaces.Box(low=-1, high=1, shape=(5,)),
                }
            ),
            action_space=gym.spaces.Discrete(5),
            model_config_dict={"fcnet_hiddens": [64]},
        ),
    },
)
module = spec.build()
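My current working assumption (which I would like confirmed) is that the module_ids, i.e. the keys of module_specs, play the role that policy_ids played in the old stack, so the mapping function should return one of those keys. In plain Python, the contract I imagine looks like this (the names are purely illustrative stand-ins, no Ray involved):

```python
# Plain-Python sketch of the contract I *assume* holds: the keys of
# module_specs are the module_ids, and the mapping function must return
# one of those keys for every agent_id it is given.
module_specs = {"local_2d": object(), "local_5d": object()}  # stand-ins

def policy_mapping_fn(agent_id, *args, **kwargs):
    # Identity mapping: each agent_id is assumed to equal a module_id.
    return agent_id

# Every agent should route to a known module.
for agent_id in ("local_2d", "local_5d"):
    assert policy_mapping_fn(agent_id) in module_specs
```

Is this mental model correct, or do module_ids and policy_ids diverge somewhere?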
The MultiAgentRLModuleSpec expects module_ids as the keys of module_specs. It is not clear to me whether a module_id should match an agent_id or a policy_id. In the old API I have been defining a policy_mapping_fn that maps agent_id to policy_id, but I am not entirely sure how module_id fits into this formulation. For example:
config = (
    PPOConfig()
    .environment(env=env_name, clip_actions=True, disable_env_checking=True)
    .rollouts(num_rollout_workers=4, rollout_fragment_length=128)
    .multi_agent(
        policies=env.get_agent_ids(),
        policy_mapping_fn=(lambda agent_id, *args, **kwargs: agent_id),
    )
)
So my question is: if I use the new experimental API and pass the multi-agent spec via the rl_module() method, do I still need to call multi_agent() on top of that? And if so, how does the policy_mapping_fn work in relation to the module_ids?
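For concreteness, here is how I currently imagine the two calls would combine. This is an untested sketch, so I may be wrong about the exact keyword (I am assuming rl_module_spec) and about whether multi_agent() is still needed at all:

```python
# Untested sketch of what I *think* the combined config would look like.
# `spec` is the MultiAgentRLModuleSpec from above; `env_name` and the
# policy IDs are from my environment.
config = (
    PPOConfig()
    .environment(env=env_name, clip_actions=True, disable_env_checking=True)
    .rollouts(num_rollout_workers=4, rollout_fragment_length=128)
    .multi_agent(
        # Assumption: the policy IDs here are the same strings as the
        # module_ids ("local_2d", "local_5d") in the spec.
        policies={"local_2d", "local_5d"},
        policy_mapping_fn=lambda agent_id, *args, **kwargs: agent_id,
    )
    .rl_module(rl_module_spec=spec)
)
```

If that is roughly right, does RLlib then look up the returned policy_id directly as a module_id in the MultiAgentRLModule?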
Thank you in advance.