How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
Hi all, I am currently a bit confused about the new API stack involving RLModules and Policies, especially in a multi-agent setting. In the example shown in the documentation:
```python
import gymnasium as gym
from ray.rllib.core.rl_module.rl_module import SingleAgentRLModuleSpec
from ray.rllib.core.rl_module.marl_module import MultiAgentRLModuleSpec

spec = MultiAgentRLModuleSpec(
    marl_module_class=BCTorchMultiAgentModuleWithSharedEncoder,
    module_specs={
        "local_2d": SingleAgentRLModuleSpec(
            observation_space=gym.spaces.Dict(
                {
                    "global": gym.spaces.Box(low=-1, high=1, shape=(2,)),
                    "local": gym.spaces.Box(low=-1, high=1, shape=(2,)),
                }
            ),
            action_space=gym.spaces.Discrete(2),
            model_config_dict={"fcnet_hiddens": [64]},
        ),
        "local_5d": SingleAgentRLModuleSpec(
            observation_space=gym.spaces.Dict(
                {
                    "global": gym.spaces.Box(low=-1, high=1, shape=(2,)),
                    "local": gym.spaces.Box(low=-1, high=1, shape=(5,)),
                }
            ),
            action_space=gym.spaces.Discrete(5),
            model_config_dict={"fcnet_hiddens": [64]},
        ),
    },
)
module = spec.build()
```
The `MultiAgentRLModuleSpec` expects `module_id`s. It is not clear whether the `module_id` should match the `agent_id` or the `policy_id`. In the old API I have been defining a `policy_mapping_fn` that maps `agent_id` to `policy_id`, but I am not entirely sure how `module_id` fits into this formulation. For example:
```python
config = (
    PPOConfig()
    .environment(env=env_name, clip_actions=True, disable_env_checking=True)
    .rollouts(num_rollout_workers=4, rollout_fragment_length=128)
    .multi_agent(
        policies=env.get_agent_ids(),
        policy_mapping_fn=(lambda agent_id, *args, **kwargs: agent_id),
    )
)
```
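For reference, this is my rough guess at what the new-stack equivalent would look like. The `rl_module()` call combined with `multi_agent()`, and reusing the spec's `module_id`s (`"local_2d"`, `"local_5d"`) as policy IDs, are my assumptions and exactly what I am unsure about:

```python
# Hypothetical sketch (not verified): wire the MultiAgentRLModuleSpec
# from above into the config, and keep a multi_agent() call whose
# policy IDs match the module_ids of the spec.
config = (
    PPOConfig()
    .environment(env=env_name, clip_actions=True, disable_env_checking=True)
    .rollouts(num_rollout_workers=4, rollout_fragment_length=128)
    .rl_module(rl_module_spec=spec)  # the spec defined earlier
    .multi_agent(
        # Assumption: policy IDs == module_ids from the spec.
        policies={"local_2d", "local_5d"},
        # Assumption: the mapping fn now effectively maps agent_id -> module_id.
        policy_mapping_fn=(lambda agent_id, *args, **kwargs: agent_id),
    )
)
```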
So my question is: if I use the new experimental API and pass the relevant multi-agent specs via the `rl_module()` method, do I still need to call `multi_agent()` on top of that? And if so, how does the `policy_mapping_fn` work in relation to the `module_id`s?
Thank you in advance.