Hi all,
I am new to Ray/RLlib and need help with a custom environment I have created. Here is a snapshot of the log from training PPO on this environment.
The counters num_agent_steps_trained and num_env_steps_trained are both 0. Also, done = False even though I set done = True once 500 steps have been sampled.
What should I check, or what possible errors should I look for, to correct this?
I also notice that no learning is happening: the mean reward per episode does not seem to improve. In addition, there are two similarly named counters, num_agent_steps_trained and num_env_steps_trained. What is the difference between the two?
I have set the episode length to 500 by setting done=True once 500 steps have been sampled.
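For context, here is a minimal sketch of what I mean, assuming a gymnasium-style env. MyCustomEnv, its spaces, observations, and rewards are placeholders rather than my actual code:

```python
import gymnasium as gym
import numpy as np
from ray.rllib.algorithms.ppo import PPOConfig


class MyCustomEnv(gym.Env):
    """Placeholder for my actual custom environment."""

    def __init__(self, config=None):
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)
        self.step_count = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.step_count = 0
        return self.observation_space.sample(), {}

    def step(self, action):
        self.step_count += 1
        obs = self.observation_space.sample()  # placeholder observation
        reward = 0.0                           # placeholder reward
        # End the episode once 500 steps have been sampled. (With gymnasium's
        # 5-tuple step API, the old single done flag is split into
        # terminated/truncated.)
        truncated = self.step_count >= 500
        return obs, reward, False, truncated, {}


# Roughly how I launch PPO training on it:
config = PPOConfig().environment(env=MyCustomEnv)
algo = config.build()
for _ in range(10):
    print(algo.train())
```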
Below is part of the log. Any help will be greatly appreciated.
Thanks,
Tarka
```
counters:
  num_agent_steps_sampled: 43000
  num_agent_steps_trained: 0
  num_env_steps_sampled: 43000
  num_env_steps_trained: 0
custom_metrics: {}
date: 2024-04-04_08-20-36
done: false
episode_len_mean: 500.0
episode_media: {}
episode_reward_max: -625.7630598837428
episode_reward_mean: -1059.0733917539524
episode_reward_min: -1620.7717931357777
episodes_this_iter: 2
episodes_total: 86
hostname: E-CND2273HMV
info:
  learner:
    all:
      num_agent_steps_trained: 128.0
      num_env_steps_trained: 1000.0
      total_loss: 2488.746065732266
```