How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty to complete my task, but I can work around it.
Hi.
I have a model trained with the PPO algorithm in a multi-agent environment.
I want to train another model that learns the behavior of a single agent (only one of the agents in the multi-agent setup I trained) using imitation learning.
Imitation learning requires offline data, and I know Ray provides a configuration option to store the trajectories collected during training as JSON files (Working With Offline Data — Ray 3.0.0.dev0).
(A slightly different problem: when I follow the example above, no offline data is created.
The only config I set is
config='{"output": "/tmp/cartpole-out"}'.
Do I need another setting to generate offline data?)
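To make the config value concrete, here is a minimal sketch that builds it with Python's `json` module and checks that it parses. It only covers one possible cause, assuming the value is handed to a JSON parser: typographic quotes (which may simply be a forum rendering artifact) are not valid JSON. `parses_as_json` is a hypothetical helper for illustration, not part of Ray.

```python
import json

# The offline-output setting from the question, as a plain Python dict.
offline_config = {"output": "/tmp/cartpole-out"}

# json.dumps always emits straight double quotes, which is what a JSON
# parser expects for the config value.
config_arg = json.dumps(offline_config)
print(config_arg)


def parses_as_json(text: str) -> bool:
    """Return True if `text` is valid JSON (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# The same setting typed with typographic (curly) quotes is rejected.
curly_variant = "{\u201coutput\u201d: \u201c/tmp/cartpole-out\u201d}"
print(parses_as_json(config_arg), parses_as_json(curly_variant))
```

If the string already uses straight quotes, this check does not explain the missing files, and the cause lies elsewhere.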
I wonder if there is an option to store only the trajectory of one agent among the policies being trained in a multi-agent environment (only that agent's observations, rewards, infos, …).
I would also like to know how to store offline data successfully.
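In case no such option exists, one possible workaround is to record everything and filter afterwards. The sketch below is not an RLlib API. It assumes each line of the output files is a JSON record of the form `{"type": "MultiAgentBatch", "policy_batches": {<policy_id>: {...columns...}}}`, that the agent of interest has its own policy, and that a single-policy record can be written as `{"type": "SampleBatch", ...columns...}`. The function names and the policy ID used with them are hypothetical.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_policy_batches(output_dir: str, policy_id: str) -> Iterator[dict]:
    """Yield the column dict of one policy from each recorded batch."""
    for path in sorted(Path(output_dir).glob("*.json")):
        with path.open() as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                record = json.loads(line)
                # Assumed record layout; other record types are skipped.
                if record.get("type") != "MultiAgentBatch":
                    continue
                batch = record.get("policy_batches", {}).get(policy_id)
                if batch is not None:
                    yield batch


def write_single_policy(output_dir: str, policy_id: str, target_file: str) -> int:
    """Write one JSON line per batch of `policy_id`; return the line count.

    `target_file` should live outside `output_dir` so it is not read back.
    """
    count = 0
    with open(target_file, "w") as out:
        for batch in iter_policy_batches(output_dir, policy_id):
            # Re-wrapping as a "SampleBatch" record is an assumption.
            out.write(json.dumps({"type": "SampleBatch", **batch}) + "\n")
            count += 1
    return count
```

A call such as `write_single_policy("/tmp/cartpole-out", "agent_0_policy", "/tmp/agent_0.json")` would then keep only that policy's columns. If several agents share one policy, the rows would need further filtering by agent, which this sketch does not attempt.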
I am sorry that I cannot share detailed code; the project is still in the planning stage.
Can someone help me? Thank you very much.