I want to save offline data for imitation learning. In a multi-agent environment, can the trajectory of only one particular agent be stored?

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I have a model trained with the PPO algorithm in a multi-agent environment.
I want to train another model that learns the behavior of one agent (just one of the agents in the multi-agent setup I trained) using imitation learning.
Imitation learning requires offline data, and I know Ray provides a configuration to store trajectories during training as JSON files (Working With Offline Data — Ray 3.0.0.dev0).
(It's a slightly different problem, but when I follow the example above, the offline data is not created.
I only set up the config like
config='{"output": "/tmp/cartpole-out"}'.
Do I need another setting to generate offline data?)
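For reference, the same output setting can also be expressed through RLlib's Python API rather than a CLI JSON string. A minimal sketch (the exact builder methods depend on your Ray/RLlib version, so verify against the docs for the version you run):

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Sketch only: write sampled trajectories to /tmp/cartpole-out as JSON.
# Method names follow the AlgorithmConfig API; check your RLlib version.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .offline_data(output="/tmp/cartpole-out")
)

algo = config.build()
algo.train()  # output-*.json files should appear under /tmp/cartpole-out
```

If no files appear, make sure the configured sampling workers actually collect episodes (the writer only flushes once sample batches are produced).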

I wonder if there is an option to store only the trajectory corresponding to one agent among the policies being trained in a multi-agent environment (only that one agent's observations, rewards, infos, ...),
and I want to know how to store offline data successfully.
I am sorry that I cannot share detailed code, because the project is still in the planning stage.
Can someone help me? Thank you very much. :grin:

Hi @coco

RLlib does not currently offer such a routine.
I started a PR on that topic a while ago.

You can do two things here:

  • Write a post-processing script that filters the recorded multi-agent data after it has been written, keeping only the agent you care about
  • Modify RLlib yourself and maybe contribute the change. We are always very happy about community PRs :slightly_smiling_face:
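The first option can be sketched as plain JSON filtering. This assumes the recorded multi-agent schema where each line of an output file is a JSON object carrying a `policy_batches` dict keyed by policy ID; the exact layout varies across RLlib versions, so inspect one of your recorded files first and adapt the key names:

```python
import json


def keep_policy(record, policy_id):
    """Return a copy of one recorded batch that keeps only `policy_id`'s data.

    Assumes `record["policy_batches"]` maps policy IDs to per-policy
    sample data (observations, rewards, infos, ...). Hypothetical schema;
    adjust to what your recorded JSON actually contains.
    """
    filtered = dict(record)
    filtered["policy_batches"] = {policy_id: record["policy_batches"][policy_id]}
    return filtered


def filter_file(in_path, out_path, policy_id):
    # RLlib's JSON writer emits one JSON object per line, so the file
    # can be rewritten line by line without loading it all into memory.
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            record = json.loads(line)
            dst.write(json.dumps(keep_policy(record, policy_id)) + "\n")
```

The filtered file can then be fed to an imitation-learning setup that expects single-agent batches, possibly after flattening the remaining `policy_batches` entry to the top level.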