[RLlib] Writing to tensorboard during custom evaluation

RickLan · February 17, 2021, 1:53am

Thank you @sven1977 for pointing out where to make the change. I added the code below and wrote a small test. I read the contribution guidelines, and it will take me some time to get a proper contribution ready, so I am posting my code here for anyone who is interested.

Add the following to the TBXLogger class (for the location, see @sven1977's message above):

                # Image Support (3, H, W)
                if isinstance(value, np.ndarray) \
                        and value.ndim == 3 \
                        and value.shape[0] == 3:
                    self._file_writer.add_image(
                        full_attr, value, global_step=step)
                    continue

Add the following to the TBXLoggerCallback class (for the location, see @sven1977's message above):

                # Image (3, H, W)
                if isinstance(value, np.ndarray) \
                        and value.ndim == 3 \
                        and value.shape[0] == 3:
                    self._trial_writer[trial].add_image(
                        full_attr, value, global_step=step)
                    continue
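
Note that the check above only matches channel-first arrays of shape (3, H, W), which as far as I know is the layout `add_image` expects by default. Frames that come out of a renderer are often (H, W, 3) instead and would be skipped silently. Here is a minimal sketch of how one might test and convert a frame before returning it from the evaluation function. The helper names `is_chw_image` and `to_chw` are my own and are not part of Ray or tensorboardX.

```python
import numpy as np


def is_chw_image(value):
    """Same condition as the logger snippets: an ndarray of shape (3, H, W)."""
    return (isinstance(value, np.ndarray)
            and value.ndim == 3
            and value.shape[0] == 3)


def to_chw(frame):
    """Convert an (H, W, 3) frame to (3, H, W).

    Anything else is returned unchanged. An array whose first axis already
    has length 3 is assumed to be channel-first (hypothetical convention).
    """
    if (isinstance(frame, np.ndarray)
            and frame.ndim == 3
            and frame.shape[-1] == 3
            and frame.shape[0] != 3):
        return np.ascontiguousarray(np.transpose(frame, (2, 0, 1)))
    return frame


# A dummy channel-last frame, e.g. what an RGB renderer might return.
hwc_frame = np.zeros((8, 10, 3), dtype=np.uint8)
chw_frame = to_chw(hwc_frame)

print(is_chw_image(hwc_frame))  # False
print(is_chw_image(chw_frame))  # True
print(chw_frame.shape)          # (3, 8, 10)
```

With something like this, a custom evaluation function could return `to_chw(frame)` so that the image branch in the logger picks it up.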

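Test code. The custom evaluation function returns a scalar counter and a 16x16 image that gains one colored pixel on the diagonal per call: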

import argparse
import gym
from gym.spaces import Discrete, Box
import numpy as np
import os

import ray
from ray import tune

parser = argparse.ArgumentParser()
parser.add_argument("--run", type=str, default="PPO")
parser.add_argument("--torch", action="store_true")
#parser.add_argument("--as-test", action="store_true")
parser.add_argument("--stop-iters", type=int, default=4)
#parser.add_argument("--stop-timesteps", type=int, default=10000)
#parser.add_argument("--stop-reward", type=float, default=0.1)

image_size = 16

def eval_fn(trainer, eval_workers):
    """Custom evaluation function that returns a scalar and an image.

    State is kept in function attributes so it persists across calls.
    """
    if not hasattr(eval_fn, "counter"):
        # First call: initialize the persistent state.
        eval_fn.counter = 0
        eval_fn.pic = np.zeros((3, image_size, image_size), dtype=np.uint8)
        eval_fn.color = np.array([255, 0, 0])

    # Paint the next diagonal pixel, then rotate the color (R -> G -> B).
    eval_fn.pic[:, eval_fn.counter, eval_fn.counter] = eval_fn.color
    eval_fn.color = np.roll(eval_fn.color, 1)
    eval_fn.counter = (eval_fn.counter + 1) % image_size
    return {"counter": eval_fn.counter, "pic": eval_fn.pic}


class SimpleCorridor(gym.Env):
    """Example of a custom env in which you have to walk down a corridor.

    You can configure the length of the corridor via the env config."""

    def __init__(self, config):
        self.end_pos = config["corridor_length"]
        self.cur_pos = 0
        self.action_space = Discrete(2)
        self.observation_space = Box(
            0.0, self.end_pos, shape=(1, ), dtype=np.float32)

    def reset(self):
        self.cur_pos = 0
        return [self.cur_pos]

    def step(self, action):
        assert action in [0, 1], action
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1
        elif action == 1:
            self.cur_pos += 1
        done = self.cur_pos >= self.end_pos
        return [self.cur_pos], 1.0 if done else -0.1, done, {}



if __name__ == "__main__":
    args = parser.parse_args()
    ray.init()

    # Can also register the env creator function explicitly with:
    # register_env("corridor", lambda config: SimpleCorridor(config))

    config = {
        "env": SimpleCorridor,  # or "corridor" if registered above
        "env_config": {
            "corridor_length": 5,
        },
        # Use GPUs iff `RLLIB_NUM_GPUS` env var set to > 0.
        #"num_gpus": int(os.environ.get("RLLIB_NUM_GPUS", "0")),
        "lr": 1e-4,  # try different lrs
        "num_workers": 1,  # parallelism
        "framework": "torch" if args.torch else "tf",

        "custom_eval_function": eval_fn,

        # Enable evaluation, once per training iteration.
        "evaluation_interval": 1,
        "evaluation_num_episodes": 1,
    }

    stop = {
        "training_iteration": args.stop_iters,
        #"timesteps_total": args.stop_timesteps,
        #"episode_reward_mean": args.stop_reward,
    }

    results = tune.run(args.run, name="test_image_logger", config=config, stop=stop)

    ray.shutdown()
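
If you want to check what the evaluation function draws before starting a Ray run, the drawing logic can be replayed on its own. This is only a sketch that mirrors the `eval_fn` logic with plain NumPy. The function `make_frames` is a hypothetical helper of mine and it does not touch Ray or the logger at all.

```python
import numpy as np

IMAGE_SIZE = 16


def make_frames(num_calls, image_size=IMAGE_SIZE):
    """Replay the eval_fn drawing logic; return a copy of the image after each call."""
    pic = np.zeros((3, image_size, image_size), dtype=np.uint8)
    color = np.array([255, 0, 0])
    counter = 0
    frames = []
    for _ in range(num_calls):
        # Paint one diagonal pixel, then rotate the color (R -> G -> B).
        pic[:, counter, counter] = color
        color = np.roll(color, 1)
        counter = (counter + 1) % image_size
        frames.append(pic.copy())
    return frames


frames = make_frames(4)
for i, frame in enumerate(frames):
    # Each call lights exactly one more channel value on the diagonal.
    print(i, frame.shape, frame.dtype, np.count_nonzero(frame))
```

With the default `--stop-iters` of 4 and one evaluation per training iteration, I would expect four images like these in the Images tab when TensorBoard is pointed at the results directory (by default `~/ray_results/test_image_logger`).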
