Logging stats side channels in custom Unity Environment using ML-Agents Academy.Instance.StatsRecorder

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

Heya, I have a custom environment built in Unity. In Unity, I use ML-Agents, which also provides an Academy.Instance.StatsRecorder to track custom data in the environment. This is how I was hoping this would work:

    from mlagents_envs.side_channel.environment_parameters_channel import EnvironmentParametersChannel
    from mlagents_envs.side_channel.stats_side_channel import StatsSideChannel
    from ray import air, tune

    env_setup_channel = EnvironmentParametersChannel()
    env_setup_channel.set_float_parameter("training", 1.0)

    # This is what I'm interested in:
    stats_channel = StatsSideChannel()

    tune.register_env(
        "unity3d",
        lambda c: CustomEnv(
            file_name=FILE_NAME,
            no_graphics=True,
            episode_horizon=EPISODE_HORIZON,
            side_channel=[env_setup_channel, stats_channel],
        ),
    )

The env_setup_channel passes parameters to the Unity environment, and that works well when I run training with tune.Tuner:

    results = tune.Tuner(
        "PPO",
        param_space=config.to_dict(),
        run_config=air.RunConfig(
            stop=stop,
            verbose=2,
            checkpoint_config=air.CheckpointConfig(
                checkpoint_frequency=5,
                checkpoint_at_end=True,
            ),
        ),
    ).fit()

However, I expected the stats that I add to Academy.Instance.StatsRecorder in the Unity environment to show up when I open the training results in TensorBoard, which is how it behaves in Unity's ML-Agents.

Am I missing something? Maybe sven1977 has an idea? Thank you!

I just found out how to do it:

  1. Create a custom callback:

    from ray.rllib.algorithms.callbacks import DefaultCallbacks

    class CustomMetricsCallback(DefaultCallbacks):
        def on_episode_end(self, worker, base_env, policies, episode, **kwargs):
            # Relies on CustomEnv exposing the StatsSideChannel as `stats_channel`.
            unity_env = base_env.get_sub_environments()[0]
            stats_side_channel = unity_env.stats_channel.stats

            # Each entry is a (value, aggregation method) tuple; log the mean value.
            for key, metric in stats_side_channel.items():
                episode.custom_metrics[key] = sum(value[0] for value in metric) / len(metric)

  2. Then add it to the config:

    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        ...
        .callbacks(CustomMetricsCallback)
    )
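
For anyone adapting this, here is a minimal standalone sketch of the averaging step from the callback, so it can be tried without Unity or Ray. It assumes the stats mapping has the shape the callback reads: each key maps to a list of (value, aggregation method) tuples. The helper name `average_stats` and the example keys are made up for illustration. Unlike the callback, it skips keys with no entries, which avoids a division by zero.

```python
from typing import Dict, Mapping, Sequence, Tuple


def average_stats(stats: Mapping[str, Sequence[Tuple[float, object]]]) -> Dict[str, float]:
    """Return the mean recorded value per key (hypothetical helper, not part of any library)."""
    averages: Dict[str, float] = {}
    for key, entries in stats.items():
        if not entries:
            # Nothing was recorded for this key, so there is nothing to average.
            continue
        # Only the first tuple element (the value) is used; the aggregation method is ignored.
        averages[key] = sum(entry[0] for entry in entries) / len(entries)
    return averages


if __name__ == "__main__":
    # Hand-written stand-in for the side channel's stats; the keys are invented.
    example = {
        "Custom/Reward": [(1.0, "average"), (3.0, "average")],
        "Custom/Empty": [],
    }
    print(average_stats(example))
```

Inside the callback this would become `episode.custom_metrics.update(average_stats(unity_env.stats_channel.stats))`. One thing to keep in mind: reading `.stats` directly never clears it, so the averages may span every episode so far. `StatsSideChannel` also has a `get_and_reset_stats()` method, which might be a better fit if you want per-episode values, but I have not tested that in this setup.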