How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty in completing my task, but I can work around it.
Best way to collect stats about actions taken during training?
I am running training loops with the PPOTrainer and I would like to know how the action distribution changes over training episodes. What is the best way to get the actions taken during training?
My current training loop:
    import pandas as pd
    from ray.rllib.agents.ppo import PPOTrainer

    # config and result_file are defined elsewhere
    rllib_trainer = PPOTrainer(config=config)
    results = []
    episode_data = []
    num_iterations = 10
    for n in range(num_iterations):
        result = rllib_trainer.train()
        results.append(result)
        # store relevant metrics from the result dict to the episode dict
        episode = {
            "n": n,
            "episode_reward_min": result["episode_reward_min"],
            "episode_reward_mean": result["episode_reward_mean"],
            "episode_reward_max": result["episode_reward_max"],
            "episode_len_mean": result["episode_len_mean"],
        }
        episode_data.append(episode)
        # store results every iteration
        result_df = pd.DataFrame(data=episode_data)
        result_df.to_csv(result_file, index=False)
Can I somehow access the action distribution via the result dict returned by train()?
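To make the goal concrete, here is a minimal sketch of the per-iteration summary I am after. It assumes a discrete action space and that the actions taken in each iteration are already available as a list of integers; how to obtain them from RLlib is exactly the open question. The helper `action_frequencies`, the variable `actions_per_iteration`, and the sample data are hypothetical and not part of RLlib.

```python
from collections import Counter


def action_frequencies(actions, num_actions):
    """Return the relative frequency of each discrete action index."""
    total = len(actions)
    if total == 0:
        # No actions recorded: report zero frequency for every action.
        return [0.0] * num_actions
    counts = Counter(actions)
    return [counts.get(a, 0) / total for a in range(num_actions)]


# Hypothetical data: actions recorded during two training iterations.
actions_per_iteration = [
    [0, 1, 1, 2],
    [1, 1, 1, 2],
]

rows = []
for n, actions in enumerate(actions_per_iteration):
    freqs = action_frequencies(actions, num_actions=3)
    row = {"n": n}
    # One column per action, e.g. "action_0_freq".
    row.update({f"action_{a}_freq": f for a, f in enumerate(freqs)})
    rows.append(row)
```

Each `row` could be merged into the `episode` dict from the loop above, so the action frequencies end up in the same CSV file as the reward metrics.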