Mapping two different policies to agents

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I was working on mapping two policies to different agents in a multi-agent environment and training them ("policy A" is assigned to "agent_0", and "policy B" is assigned to every other agent).
I found that there is a way to do this:

Setting config["multiagent"]["policy_mapping_fn"] on the "trainer" object.
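For concreteness, here is a minimal sketch of what such a mapping function might look like. The agent ID "agent_0" and the policy names "policy_A"/"policy_B" are assumptions for illustration; use whatever IDs your environment actually emits.

```python
# Route agent_0 to policy_A and every other agent to policy_B.
# Agent IDs and policy names here are hypothetical examples.
def policy_mapping_fn(agent_id):
    return "policy_A" if agent_id == "agent_0" else "policy_B"

config = {
    "multiagent": {
        "policies": {},  # fill in the policy specs for policy_A and policy_B
        "policy_mapping_fn": policy_mapping_fn,
    },
}
```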

I'm curious about the following:

I think simply setting the "trainer" object's config["multiagent"]["policy_mapping_fn"] will map the policies accordingly and train them separately, and that I don't need to do anything else in my custom environment. Am I right?

Thank you so much for reading. I look forward to your answer.

Hi @coco,

You are correct: the policy mapping function determines which policy is used for each agent. This is handled seamlessly for you during training by RLlib.

The only requirement on your environment is that it provide a multi-agent dictionary with a separate observation, reward, done, and info value for each agent at each step. However, it is not required that every agent in the environment have an observation on every step.
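To illustrate the expected shape, here is a sketch of what a multi-agent `step()` could return, assuming two hypothetical agents "agent_0" and "agent_1". On steps where an agent has no observation, its key can simply be omitted from these dictionaries.

```python
# Per-agent dictionaries, keyed by agent ID (IDs are illustrative).
obs = {"agent_0": [0.0, 1.0], "agent_1": [1.0, 0.0]}
rewards = {"agent_0": 1.0, "agent_1": -1.0}
# "__all__" signals whether the whole episode is finished.
dones = {"agent_0": False, "agent_1": False, "__all__": False}
infos = {"agent_0": {}, "agent_1": {}}
```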