How to log current Epsilon value on Ray Tune?

armmer016 · July 4, 2025, 9:21am

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.
Low: Annoying but doesn’t hinder my work.
Medium: Significantly affects my productivity but can find a workaround.
High: Completely blocks me.

2. Environment:

Ray version:
– 2.47.1
Python version:
– 3.12.7
OS:
– MacOs Apple M3 os: 15.5

3. What happened vs. what you expected:

Expected:
– Log current epsilon value to Tensorboard via MetricLogger
Actual:
– I got error when run the code

site-packages/ray/rllib/algorithms/algorithm.py", line 2355, in get_policy
    return self.env_runner.get_policy(policy_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'SingleAgentEnvRunner' object has no attribute 'get_policy'

I found that algorithm.get_poliy() have @OldAPIStack annotation, but I don’t know how to solved this on new APIStack.

This is my Callbacks to log to Tensorboard

class ActionStatsCallback(DefaultCallbacks):
    def on_train_result(self, *, algorithm, metrics_logger, result, **kwargs) -> None:
        """Log the current exploration epsilon"""

        policy = algorithm.get_policy()
        epsilon = policy.get_exploration_state().get("cur_epsilon")
        metrics_logger.log_value("cur_epsilon", float(epsilon))

This is my DQNConfig

algo_config = (
        DQNConfig()
        .environment("EnvTrain", env_config={})
        .framework("torch")
        .env_runners(num_cpus_per_env_runner=1,
                     num_env_runners=1,
                     num_envs_per_env_runner=2,
                     rollout_fragment_length='auto',
                     sample_timeout_s=120,
                     batch_mode="complete_episodes",
                     explore= True,
                    )
        .resources(num_gpus=0,num_cpus_for_main_process=4)
        .callbacks(ActionStatsCallback)
        .evaluation(
            evaluation_interval=2,
            evaluation_duration=1,
            evaluation_config={"env": "EnvVal", "env_config": {}},
        )
        .training(
            model={
                    "fcnet_hiddens": [128, 64,], 
                    "fcnet_activation": "relu",
                    "use_attention":True,
                    "use_lstm":True,
                    "lstm_use_prev_actions": tune.grid_search([32,16]),
                    "attention_use_n_prev_actions":8,
                    "attention_use_n_prev_rewards":8,
                },
            num_epochs=30,
            n_step=4,
            gamma=0.99,
            num_atoms=51,
            train_batch_size=64,
            noisy=True,
            dueling=True,
            double_q=True,
    )

PhilippWillms · July 15, 2025, 6:47pm

For me, this rather looks like an RLlib topic than a Tune topic. @arturn : What do you think?

arturn · August 1, 2025, 7:56pm

Yes, thanks for bringing this up.
Logging to tensorboard is connected to tune. But the error you are running into is RLlib related.

You hit an issue there. You local env runner needs to be a RolloutWorker (old API stack).
You need to disable enable_env_runner_and_connector_v2 to use RolloutWorkers.

If you are using the new API stack, you’ll be able to fetch the epsilon from the DQN RL Module. Grab the module and run → dqn_rl_module.epsilon_schedule.get_current_value()

Topic		Replies	Views
Dqn algo epsilon not logged RLlib	3	358	December 1, 2022
How to get the current epsilon value after a training iteration? RLlib	10	1139	July 28, 2022
Tune.run is not emitting tensorboard log file Ray Tune	0	266	August 11, 2021
Obtain in eager mode Actor model output in ray.tune mode using DefaultCallbacks class RLlib	0	229	October 13, 2021
Tune.run is not writing tensorboard log file RLlib	4	840	August 24, 2021

How to log current Epsilon value on Ray Tune?

Related topics