Loading decentralised pre-trained policies into a multi-agent environment for further training and evaluation

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi, I’m trying to set up a workflow in a modified version of multi-agent WaterWorld (I’m using PettingZoo). First, I want to pre-train two agents individually in a single-agent version of the environment (for now, I’m training them in two copies of the same single-agent environment, one for each agent). After this pre-training phase, I would like to save their checkpoints and then load their policies into the multi-agent version of the environment to evaluate them and (possibly) train them further. Is this possible? I’m also using Ray Tune and would like to keep using it for hyperparameter tuning. Thanks a lot!
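To make the intended workflow concrete, here is a rough, library-free sketch of what I have in mind. It deliberately does not use the RLlib API: `pretrain_single_agent`, `save_weights`, `load_weights`, and `build_multi_agent_policies` are hypothetical placeholders, the weights are fake, and the agent IDs `pursuer_0` / `pursuer_1` are an assumption about how the multi-agent environment names its agents. I assume the RLlib equivalent would involve a `policy_mapping_fn` plus some way to restore each policy from its own checkpoint, which is the part I’m unsure about.

```python
import pickle
import tempfile
from pathlib import Path


def pretrain_single_agent(seed):
    """Hypothetical placeholder for single-agent pre-training; returns fake weights."""
    return {"layer_0": [0.1 * seed, 0.2 * seed], "layer_1": [0.3 * seed]}


def save_weights(weights, path):
    """Write one policy's weights to its own checkpoint file."""
    with open(path, "wb") as f:
        pickle.dump(weights, f)


def load_weights(path):
    """Read one policy's weights back from its checkpoint file."""
    with open(path, "rb") as f:
        return pickle.load(f)


def policy_mapping_fn(agent_id):
    """Map each agent ID to its own policy ID (decentralised: one policy per agent)."""
    return f"policy_{agent_id.split('_')[-1]}"


def build_multi_agent_policies(checkpoint_dir, agent_ids):
    """Load one pre-trained policy per agent for the multi-agent phase."""
    policies = {}
    for agent_id in agent_ids:
        policy_id = policy_mapping_fn(agent_id)
        policies[policy_id] = load_weights(Path(checkpoint_dir) / f"{policy_id}.pkl")
    return policies


# Phase 1: pre-train each agent separately and save one checkpoint per policy.
checkpoint_dir = Path(tempfile.mkdtemp())
for i in range(2):
    save_weights(pretrain_single_agent(seed=i + 1), checkpoint_dir / f"policy_{i}.pkl")

# Phase 2: load both policies into the multi-agent setup (agent IDs are assumed).
agent_ids = ["pursuer_0", "pursuer_1"]
policies = build_multi_agent_policies(checkpoint_dir, agent_ids)
```

The key point is that each agent keeps its own separately trained policy, and the mapping function decides which loaded policy controls which agent once both are in the shared environment.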

[By the way, I saw that there was a very similar question back in 2021, but I suppose that by now the answer to that is obsolete.]