Logging from DefaultCallbacks to LoggerCallback gives weird behavior

How severely does this issue affect your experience of using Ray?

  • Low: It annoys or frustrates me for a moment.

I have a callback that inherits from DefaultCallbacks and overrides on_episode_end, plus a class that extends TBXLoggerCallback. I need to inherit from DefaultCallbacks because the callback needs access to the environment to grab a render (and a GIF, but this example only covers the render).
Now the problem is that the images variable in log_trial_result contains duplicates. Even after I tried, for example, setting result['episode_media']['images'] = [] once all the images were written, it still kept some old images from the result dicts of previous episodes. In the end I fixed it by keeping a variable that tracks every image logged so far and skipping any image already in that list (a minimal sketch of this workaround follows the code below), but ideally Ray would reset the result dict after log_trial_result or something similar. So I am wondering what causes this behavior. It is also not the case that everything gets kept: at some point old images do get dropped from the result dict, just not on every result.

What causes this behavior?

from ray.rllib.algorithms.callbacks import DefaultCallbacks
from ray.tune.logger import TBXLoggerCallback

class RenderCallback(DefaultCallbacks):
    def on_episode_end(
        self, *, worker, base_env, policies, episode, env_index, **kwargs
    ):
        environment = base_env.get_sub_environments()[0]
        # Only render every 10th episode to keep the overhead down.
        if environment.episode_counter % 10 == 0:
            image_to_save_to_TB = environment.get_render(...)  # args elided here
            episode.media['images'] = image_to_save_to_TB

class TBXLoggerExtended(TBXLoggerCallback):
    def log_trial_result(self, iteration: int, trial: "Trial", result: dict):
        super().log_trial_result(iteration, trial, result)
        images = result['episode_media']['images']
        for image in images:
            self._trial_writer[trial].add_image(trial.logdir, image)
        # result['episode_media']['images'] = []  # Even if this is
        # uncommented, old images still show up in later results.
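
Roughly, my workaround looks like the following sketch. The name TBXLoggerDeduped and the _logged_ids set are illustrative, not verbatim from my code, and using id() as the dedup key is just one way to implement the "already logged" check (id() values can be recycled after garbage collection, so hashing the image contents would be more robust):

class TBXLoggerDeduped(TBXLoggerCallback):
    def __init__(self):
        super().__init__()
        # Tracks which image objects have already been written to TensorBoard.
        self._logged_ids = set()

    def log_trial_result(self, iteration: int, trial: "Trial", result: dict):
        super().log_trial_result(iteration, trial, result)
        images = result.get('episode_media', {}).get('images', [])
        for image in images:
            if id(image) in self._logged_ids:
                continue  # stale image carried over from an earlier result dict
            self._logged_ids.add(id(image))
            self._trial_writer[trial].add_image(trial.logdir, image)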

Hey @RaymondK, I think you are referring to two different kinds of "callbacks" here.

Your RenderCallback is an RLlib one, so you have to specify it in your RLlib config like so:

from ray.rllib.algorithms.ppo import PPOConfig
config = PPOConfig().callbacks(RenderCallback)

# OR via old-style config dicts
from ray.rllib.algorithms.ppo import PPO
trainer = PPO(config={
    "callbacks": RenderCallback,
})

The TBX one is a sub-class of Tune's LoggerCallback (I know, very confusing, sorry), which has nothing to do with RLlib's callback system, but with Ray Tune (currently undergoing a transition to "Ray AIR"); see the sketch below. Can you ask about the details of setting this up on the Ray Tune forum?
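
For completeness, here is a minimal sketch (assuming Ray 2.x; the CartPole-v1 environment and the stop criterion are illustrative choices, not from your post) of how the two callback types get wired up in different places: the RLlib callback goes into the algorithm config, while the Tune LoggerCallback subclass is passed to tune.run():

from ray import tune
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")     # illustrative env choice
    .callbacks(RenderCallback)      # RLlib callback: runs inside the algorithm
)

tune.run(
    "PPO",
    config=config.to_dict(),
    callbacks=[TBXLoggerExtended()],  # Tune callback: runs on the driver
    stop={"training_iteration": 5},   # illustrative stop criterion
)

The key difference is where they run: RLlib callbacks like on_episode_end execute inside the rollout workers, whereas Tune LoggerCallbacks execute on the driver as part of the trial result-processing loop.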
