How to do MARL with different policies using Ray Tune?

Hello everyone!

I am having a lot of fun with RLlib/Ray. It is very powerful!

I want to do Multi-Agent Deep Reinforcement Learning with mixed policies, and I am looking for some inspiration. So far, I have mastered multiple agents with a shared policy. The next step would be multiple agents with multiple policies.

This seems to be straightforward according to this blog post: Scaling Multi-Agent Reinforcement Learning – The Berkeley Artificial Intelligence Research Blog
Please scroll down to “Level 2: Multiple agents, multiple policies” (sorry for the inconvenience, there is no HTML anchor).

BUT! I want to use Ray Tune to do that. This does not seem to be trivial.
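
To make the question concrete, here is a rough sketch of what I imagine. It assumes the dict-style `multiagent` config of the older RLlib API can be handed to `tune.run` unchanged, which may differ between Ray versions. The environment name `my_multi_agent_env`, the agent IDs of the form `agent_<n>`, the policy names, and the per-policy `gamma` overrides are all made up for illustration.

```python
def policy_mapping_fn(agent_id, *args, **kwargs):
    """Map each agent to one of two policies (hypothetical scheme).

    Assumes agent IDs look like "agent_0", "agent_1", ...
    *args/**kwargs absorb the extra parameters (episode, worker, ...)
    that some RLlib versions pass to this callback.
    """
    index = int(str(agent_id).rsplit("_", 1)[-1])
    return "policy_a" if index % 2 == 0 else "policy_b"


def build_config(obs_space, act_space, env_name="my_multi_agent_env"):
    """Build a dict-style config with two separately trained policies."""
    return {
        "env": env_name,  # hypothetical name, must be registered first
        "multiagent": {
            # policy_id -> (policy_class, obs_space, act_space, overrides)
            # None as policy class means "use the trainer's default policy".
            "policies": {
                "policy_a": (None, obs_space, act_space, {"gamma": 0.99}),
                "policy_b": (None, obs_space, act_space, {"gamma": 0.95}),
            },
            "policy_mapping_fn": policy_mapping_fn,
        },
    }


def train(obs_space, act_space, env_creator):
    """Launch the experiment through Tune (requires ray[rllib])."""
    # Imported here so the config helpers above work without Ray installed.
    from ray import tune
    from ray.tune.registry import register_env

    # env_creator is a user-supplied function: env_config -> MultiAgentEnv
    register_env("my_multi_agent_env", env_creator)
    return tune.run(
        "PPO",
        config=build_config(obs_space, act_space),
        stop={"training_iteration": 10},
    )
```

The idea would be that Tune only sees an ordinary config dict, so the per-policy setup lives entirely under the `multiagent` key. I am not sure this is the intended way, though.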

Do you have any ideas?