How severely does this issue affect your experience of using Ray?
- Medium: It contributes to significant difficulty in completing my task, but I can work around it.
 
Best way to collect stats about actions taken during training?
I am running training loops with the PPOTrainer and I would like to know how the action distribution changes over training episodes. What is the best way to get the actions taken during training?
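My current training loop:
import pandas as pd
from ray.rllib.agents.ppo import PPOTrainer  # import path for Ray versions that still ship PPOTrainer

# `config` (trainer config dict) and `result_file` (CSV path) are defined elsewhere
rllib_trainer = PPOTrainer(config=config)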
results = []
episode_data = []
num_iterations = 10
for n in range(num_iterations):
    result = rllib_trainer.train()
    results.append(result)
    # store relevant metrics from the result dict to the episode dict
    episode = {
        "n": n,
        "episode_reward_min": result["episode_reward_min"],
        "episode_reward_mean": result["episode_reward_mean"],
        "episode_reward_max": result["episode_reward_max"],
        "episode_len_mean": result["episode_len_mean"],
    }
    episode_data.append(episode)
    # store results every iteration
    result_df = pd.DataFrame(data=episode_data)
    result_df.to_csv(result_file, index=False)
Can I somehow access the action distribution via the results returned by train()?
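To make the goal concrete, here is a minimal sketch of the per-iteration statistic I am after. It assumes a discrete action space and assumes the raw actions of each iteration are already available as a plain list, which is exactly the part I do not know how to get out of RLlib. The names `action_frequencies` and `actions_per_iteration` and the sample values are hypothetical placeholders, not RLlib API or real data.

```python
from collections import Counter


def action_frequencies(actions, num_actions):
    """Return the relative frequency of each discrete action index."""
    total = len(actions)
    if total == 0:
        # No actions recorded for this iteration: report all zeros.
        return [0.0] * num_actions
    counts = Counter(actions)
    return [counts.get(a, 0) / total for a in range(num_actions)]


# Hypothetical placeholder data: one list of action indices per training iteration.
actions_per_iteration = [
    [0, 0, 1, 2],
    [1, 1, 1, 2],
]

rows = []
for n, actions in enumerate(actions_per_iteration):
    freqs = action_frequencies(actions, num_actions=3)
    # One column per action, in the same row-per-iteration shape as the episode dict.
    row = {"n": n}
    row.update({f"action_{a}_freq": f for a, f in enumerate(freqs)})
    rows.append(row)
```

Each entry of `rows` could then be merged into the `episode` dict in the loop above, so that the action frequencies end up in the same CSV as the reward metrics.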