Tune and custom callback fails

Hi,

I am using a custom callback that works fine when I train an RL model without using Tune:

from ray.rllib.algorithms import ppo

config = (
    ppo.PPOConfig()
    .environment(MyCustomEnv, env_config=env_config, disable_env_checking=True)
    .training(
        model={
            'custom_model': model_class_name,
            'custom_model_config': custom_model_config,
        },
    )
    .framework(eager_tracing=True, framework=framework)
    .resources(num_gpus=1, num_cpus_per_worker=1, num_cpus_per_learner_worker=2)
    .rollouts(enable_connectors=False, num_rollout_workers=2)
    .callbacks(MyCallbacks)
    .reporting(keep_per_episode_custom_metrics=True)
)

algo = config.build()

num_iterations = 2
for _ in range(num_iterations):
    result = algo.train()

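For context, this is how I verify the callback ran: with keep_per_episode_custom_metrics=True, each custom-metric key recorded by the callback should show up as a list of per-episode values in the result dict (rather than aggregated <key>_mean/_min/_max entries):

# Keys depend on what MyCallbacks records per episode.
print(result["custom_metrics"])
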
However, when I try to use Tune, namely:

from ray import air, tune

tuner = tune.Tuner(
    "PPO",
    run_config=air.RunConfig(
        storage_path=run_path,
        stop={
            "training_iteration": num_iterations,
        },
        # callbacks=[MyCallbacks],
    ),
    param_space=config,
)
result = tuner.fit()

The custom callback is ignored, even though, as stated above, it works fine when Tune is not used.
If I instead include the custom callback in my air.RunConfig by uncommenting the line above, I receive the following error:

AttributeError: type object 'MyCallbacks' has no attribute 'setup'

I also tried the variation callbacks=[MyCallbacks()]; this leads to the same error.
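
For reference, RunConfig(callbacks=...) appears to expect instances of ray.tune.Callback, which Tune calls setup() on at experiment start; that would explain the AttributeError when it is handed my RLlib-style callback. A minimal Tune callback looks roughly like this (the print body is just an illustration):

from ray.tune import Callback


class MyTuneCallback(Callback):
    # Tune invokes setup() once when the experiment starts, then
    # trial-level hooks such as on_trial_result() as results come in.
    def on_trial_result(self, iteration, trials, trial, result, **info):
        print(f"Trial {trial}: custom_metrics =", result.get("custom_metrics"))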

My current ray version is 2.5.1 (pip install ray[default,rllib]==2.5.1)

I am having the same issue on Ray 2.20.0 (updated from 2.8.0 which also had the same problem). Any help is appreciated.

Replying to myself:

I was able to get it to run by subclassing tune.Callback, but I have not yet been able to see the custom metrics in the output. Simplified callback class for posterity:

import numpy as np
# from ray.rllib.algorithms.callbacks import DefaultCallbacks
from ray.tune import Callback
from ray.rllib.env import BaseEnv
from ray.rllib.policy import Policy
from ray.rllib.evaluation import RolloutWorker
from ray.rllib.evaluation.episode import Episode
from typing import Dict


class MarketCallbacks(Callback):
    def on_episode_start(
        self,
        *,
        worker: RolloutWorker,
        base_env: BaseEnv,
        policies: Dict[str, Policy],
        episode: Episode,
        env_index: int,
        **kwargs,
    ):
        # Initialize storage for prices in this episode
        episode.user_data["prices"] = []

    def on_episode_step(
        self,
        *,
        worker: RolloutWorker,
        base_env: BaseEnv,
        policies: Dict[str, Policy],
        episode: Episode,
        env_index: int,
        **kwargs,
    ):
        # Retrieve the market object from the environment
        market = base_env.get_unwrapped()[env_index].market
        current_price = market.get_current_price()
        episode.user_data["prices"].append(current_price)

    def on_episode_end(
        self,
        *,
        worker: RolloutWorker,
        base_env: BaseEnv,
        policies: Dict[str, Policy],
        episode: Episode,
        env_index: int,
        **kwargs,
    ):
        prices = np.array(episode.user_data["prices"])
        episode.custom_metrics["max_price"] = np.max(prices)
        episode.custom_metrics["mean_price"] = np.mean(prices)
        episode.custom_metrics["std_price"] = np.std(prices)

        episode.hist_data["price"] = episode.user_data["price"]

    def on_train_result(self, *, algorithm, result: dict, **kwargs):
        # you can mutate the result dict to add new fields to return
        result["callback_ok"] = True

        p = result["custom_metrics"]["price"]
        result["custom_metrics"]["prices_max"] = np.max(p)
        result["custom_metrics"]["prices_mean"] = np.mean(p)
        result["custom_metrics"]["prices_std"] = np.std(p)

Hi @jacob-thrackle,

The terminology is overloaded, which makes this confusing. There are Tune callbacks and RLlib callbacks, and they are different types. You are providing an RLlib callback to the Tune config; you need to pass it to the RLlib AlgorithmConfig instead.

I could not find an example in the project, but there is an issue that shows an example using both an RLlib callback and a Tune callback. You may need to adapt it to use the builder pattern (see the AlgorithmConfig API reference below) instead of a dictionary.

https://docs.ray.io/en/latest/rllib/package_ref/doc/ray.rllib.algorithms.algorithm_config.AlgorithmConfig.html#ray.rllib.algorithms.algorithm_config.AlgorithmConfig
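
Roughly, the split looks like this; an untested sketch with placeholder names (MyRLlibCallbacks, MyTuneCallback, the CartPole env, and the example metric are all stand-ins for your own code):

from ray import air, tune
from ray.rllib.algorithms.callbacks import DefaultCallbacks
from ray.rllib.algorithms.ppo import PPOConfig
from ray.tune import Callback


class MyRLlibCallbacks(DefaultCallbacks):
    # Episode-level hooks (on_episode_start/step/end, ...) belong in an
    # RLlib callback, which is passed as a *class* to the AlgorithmConfig.
    def on_episode_end(self, *, worker, base_env, policies, episode, env_index, **kwargs):
        episode.custom_metrics["example_metric"] = 1.0


class MyTuneCallback(Callback):
    # Trial-level hooks belong in a Tune callback, which is passed as an
    # *instance* to air.RunConfig.
    def on_trial_result(self, iteration, trials, trial, result, **info):
        print("custom_metrics:", result.get("custom_metrics"))


config = (
    PPOConfig()
    .environment("CartPole-v1")
    .callbacks(MyRLlibCallbacks)
    .reporting(keep_per_episode_custom_metrics=True)
)

tuner = tune.Tuner(
    "PPO",
    run_config=air.RunConfig(
        stop={"training_iteration": 2},
        callbacks=[MyTuneCallback()],
    ),
    param_space=config,
)
tuner.fit()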
