"No data for "policy_name", not updating kl" warning

Hi,

I am using the simple_tag environment from PettingZoo (https://www.pettingzoo.ml/mpe/simple_tag), which has two agent types: “agent_#” and “adversary_#”. I use 3 “agents” and 6 “adversaries”. When I use one policy, named “shared_1”, for the “agent_#” agents and another, named “shared_2”, for the “adversary_#” agents, I get the warning: No data for shared_2, not updating kl.

As I understand it, this means the shared_2 policy is not being used. I don’t get this warning when I use independent learning (one policy per agent, regardless of type), or when I use 2 shared policies with 1 “agent” and 3 “adversaries”, which is the default configuration.

Do you know why this is happening, and whether it is related to RLlib or to the environment?

Please find the code below.
Thanks,
George


from ray import tune
import ray
from ray.tune.registry import register_env
from ray.rllib.env.wrappers.pettingzoo_env import PettingZooEnv
from pettingzoo.mpe import simple_tag_v2
from supersuit import pad_observations_v0


if __name__ == "__main__":

    ray.init()

    def env_creator(args):
        env = simple_tag_v2.env(num_good=3, num_adversaries=6, num_obstacles=2, max_cycles=25)
        env = pad_observations_v0(env)
        return env

    register_env("simple_tag", lambda args: PettingZooEnv(env_creator(args)))
    test_env = PettingZooEnv(env_creator({}))
    obs_space = test_env.observation_space
    act_spc = test_env.action_space

    policies = {"shared_1": (None, obs_space, act_spc, {}),
                "shared_2": (None, obs_space, act_spc, {})
                }

    policy_ids = list(policies.keys())


    def policy_mapping_fn(agent_id):
        if agent_id == "agent_0" or "agent_1" or "agent_2":
            return "shared_1"
        else:
            return "shared_2"

    tune.run(
        "PPO",
        name="PPO simple_tag trial 1",
        stop={"episodes_total": 50000},
        checkpoint_freq=10,
        config={
            # Environment specific
            "env": "simple_tag",
            # General
            "framework": "torch",
            "num_gpus": 0,
            "num_workers": 0,
            # Method specific
            "multiagent": {
                "policies": policies,
                "policy_mapping_fn": policy_mapping_fn,
            },
        },
    )
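
One likely cause is the condition in policy_mapping_fn above. Python parses `agent_id == "agent_0" or "agent_1" or "agent_2"` as `(agent_id == "agent_0") or "agent_1" or "agent_2"`, and a non-empty string is always truthy, so every agent, including the adversaries, would be mapped to shared_1 and shared_2 would never receive data. Here is a minimal sketch of the problem and one possible fix, assuming the agent IDs follow the “agent_#” / “adversary_#” naming described in the post. The names `buggy_mapping` and `fixed_mapping` are made up for illustration, and the extra `*args, **kwargs` are only there in case the installed RLlib version passes more than `agent_id` to the mapping function.

```python
def buggy_mapping(agent_id):
    # Parsed as: (agent_id == "agent_0") or "agent_1" or "agent_2".
    # The non-empty string "agent_1" is truthy, so this branch is always taken.
    if agent_id == "agent_0" or "agent_1" or "agent_2":
        return "shared_1"
    else:
        return "shared_2"


def fixed_mapping(agent_id, *args, **kwargs):
    # Map by name prefix, so any number of good agents shares one policy.
    # Assumption: good agents are named "agent_<n>", adversaries "adversary_<n>".
    if agent_id.startswith("agent_"):
        return "shared_1"
    return "shared_2"


# Agent IDs for 3 good agents and 6 adversaries (naming as in the post).
agent_ids = [f"adversary_{i}" for i in range(6)] + [f"agent_{i}" for i in range(3)]

print({a: buggy_mapping(a) for a in agent_ids})
print({a: fixed_mapping(a) for a in agent_ids})
```

With the buggy version, every ID in `agent_ids` ends up on shared_1. An equivalent fix for the original condition would be `agent_id in ("agent_0", "agent_1", "agent_2")`.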

Hi,

I would also like to ask whether anyone knows exactly what this warning means, so that I know where to start looking. I could not find much about it in other questions.

Thanks
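
On the meaning of the warning: as far as I can tell, PPO adjusts each policy's KL coefficient after every training iteration, using the mean KL divergence reported for that policy. It seems to log this message when the training batch contained no samples for a policy, so there is no KL value to use. In other words, the policy exists in the config, but no agent was mapped to it during sampling. The sketch below is a simplified illustration of that logic, not RLlib's actual code. `update_kl_coeffs`, its arguments, and the constants in the adaptation rule are assumptions made for the example.

```python
import logging

logger = logging.getLogger(__name__)


def update_kl_coeffs(kl_coeffs, learner_stats, kl_target=0.01):
    """Return (updated coefficients, list of policy IDs that were skipped).

    kl_coeffs: dict of policy_id -> current KL coefficient (hypothetical).
    learner_stats: dict of policy_id -> {"kl": mean KL of the last batch}.
        A policy that received no sampled data has no entry here.
    """
    updated = dict(kl_coeffs)
    skipped = []
    for policy_id, coeff in kl_coeffs.items():
        stats = learner_stats.get(policy_id)
        if stats is None:
            # No agent was mapped to this policy, so there is no KL to use.
            logger.warning("No data for %s, not updating kl", policy_id)
            skipped.append(policy_id)
            continue
        # Illustrative adaptive rule: raise the penalty when the measured KL
        # is well above the target, lower it when it is well below.
        if stats["kl"] > 2.0 * kl_target:
            updated[policy_id] = coeff * 1.5
        elif stats["kl"] < 0.5 * kl_target:
            updated[policy_id] = coeff * 0.5
    return updated, skipped


coeffs, skipped = update_kl_coeffs(
    {"shared_1": 0.2, "shared_2": 0.2},
    {"shared_1": {"kl": 0.05}},  # shared_2 received no samples
)
print(coeffs, skipped)
```

If that reading is right, the place to look is the policy_mapping_fn rather than the environment: printing the policy ID it returns for each agent ID should show whether any agent is ever assigned to the policy named in the warning.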