1. Severity of the issue: (select one)
- None: I’m just curious or want clarification.
- Low: Annoying but doesn’t hinder my work.
- Medium: Significantly affects my productivity but can find a workaround.
- High: Completely blocks me.
2. Environment:
- Ray version: 2.47.1
- Python version: 3.12.7
- OS: macOS 15.5 (Apple M3)
3. What happened vs. what you expected:
- Expected: the current epsilon value is logged to TensorBoard via `MetricsLogger`.
- Actual: I get an error when running the code:
```
  File ".../site-packages/ray/rllib/algorithms/algorithm.py", line 2355, in get_policy
    return self.env_runner.get_policy(policy_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'SingleAgentEnvRunner' object has no attribute 'get_policy'
```
I found that `algorithm.get_policy()` carries the `@OldAPIStack` annotation, but I don’t know how to solve this on the new API stack.
This is my callback for logging to TensorBoard:
```python
from ray.rllib.algorithms.callbacks import DefaultCallbacks


class ActionStatsCallback(DefaultCallbacks):
    def on_train_result(self, *, algorithm, metrics_logger, result, **kwargs) -> None:
        """Log the current exploration epsilon."""
        policy = algorithm.get_policy()  # <- raises the AttributeError above
        epsilon = policy.get_exploration_state().get("cur_epsilon")
        metrics_logger.log_value("cur_epsilon", float(epsilon))
```
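Since `Policy` objects don’t exist on the new API stack, one workaround I’m considering is to recompute epsilon from the `DQNConfig.epsilon` schedule instead of querying a policy. This is only a sketch: the `(timestep, value)` schedule format, the `config.epsilon` attribute, and where the lifetime env-step counter sits inside `result` are assumptions on my part that I haven’t verified against 2.47.1.

```python
from ray.rllib.algorithms.callbacks import DefaultCallbacks
from ray.rllib.utils.metrics import NUM_ENV_STEPS_SAMPLED_LIFETIME


class EpsilonFromScheduleCallback(DefaultCallbacks):
    """Sketch: derive the current epsilon from the configured schedule."""

    def on_train_result(self, *, algorithm, metrics_logger, result, **kwargs) -> None:
        # Assumption: DQNConfig.epsilon is a piecewise-linear schedule given as
        # (timestep, value) points, e.g. [(0, 1.0), (10000, 0.05)] (the default).
        schedule = [tuple(p) for p in algorithm.config.epsilon]
        # Assumption: the lifetime env-step counter is either at the top level
        # of `result` or under the "env_runners" sub-dict, depending on version.
        env_runner_results = result.get("env_runners", result)
        ts = env_runner_results.get(NUM_ENV_STEPS_SAMPLED_LIFETIME, 0)
        # Linear interpolation between the two surrounding schedule points;
        # past the last point, epsilon stays at its final value.
        epsilon = schedule[-1][1]
        for (t0, v0), (t1, v1) in zip(schedule, schedule[1:]):
            if ts < t1:
                frac = (ts - t0) / max(1, t1 - t0)
                epsilon = v0 + frac * (v1 - v0)
                break
        metrics_logger.log_value("cur_epsilon", float(epsilon), window=1)
```

Alternatively, I could pin the old API stack with `config.api_stack(enable_rl_module_and_learner=False, enable_env_runner_and_connector_v2=False)` so that `get_policy()` works again, but I’d prefer to stay on the new stack.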
This is my DQNConfig:
```python
from ray import tune
from ray.rllib.algorithms.dqn import DQNConfig

algo_config = (
    DQNConfig()
    .environment("EnvTrain", env_config={})
    .framework("torch")
    .env_runners(
        num_cpus_per_env_runner=1,
        num_env_runners=1,
        num_envs_per_env_runner=2,
        rollout_fragment_length="auto",
        sample_timeout_s=120,
        batch_mode="complete_episodes",
        explore=True,
    )
    .resources(num_gpus=0, num_cpus_for_main_process=4)
    .callbacks(ActionStatsCallback)
    .evaluation(
        evaluation_interval=2,
        evaluation_duration=1,
        evaluation_config={"env": "EnvVal", "env_config": {}},
    )
    .training(
        model={
            "fcnet_hiddens": [128, 64],
            "fcnet_activation": "relu",
            "use_attention": True,
            "use_lstm": True,
            "lstm_use_prev_actions": tune.grid_search([32, 16]),
            "attention_use_n_prev_actions": 8,
            "attention_use_n_prev_rewards": 8,
        },
        num_epochs=30,
        n_step=4,
        gamma=0.99,
        num_atoms=51,
        train_batch_size=64,
        noisy=True,
        dueling=True,
        double_q=True,
    )
)
```
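For reference, a minimal driver along these lines reproduces the error (a sketch; because the model dict uses `tune.grid_search`, the config has to go through Tune, and the stop criterion is arbitrary):

```python
from ray import tune

tuner = tune.Tuner(
    "DQN",
    param_space=algo_config,
    # Stop after a couple of iterations; the callback crashes on the first
    # on_train_result call anyway.
    run_config=tune.RunConfig(stop={"training_iteration": 2}),
)
tuner.fit()
```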