Hi,
Say I’m running some A3C agents on an arbitrary environment using tune.run().
When I look at the reward-over-timesteps graphs I see that there are a lot of timesteps (10M), but how do I know when (if at all) the environment has been reset?
If I don’t call env.reset() in my own code, do the agents just keep taking steps in the environment until it’s done?
Generally an RLlib worker calls the reset() method of its environment instance after the environment returns done=True from the step() call. The dones are part of the experiences that your workers produce; you can access and log them if you are interested. Otherwise they are accumulated in their own metrics, e.g. episodes_this_iter!
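If you want to see exactly when those resets happen, a custom callback is one way to do it. Here is a minimal sketch assuming a Ray 1.x-style RLlib API (the class name EpisodeLogger is just illustrative):

```python
from ray.rllib.agents.callbacks import DefaultCallbacks


class EpisodeLogger(DefaultCallbacks):
    """Logs every episode end, i.e. every point at which a worker is about to reset its env."""

    def on_episode_end(self, *, worker, base_env, policies, episode, env_index, **kwargs):
        # Called once per finished episode (done=True), right before the worker resets the env.
        print(f"episode {episode.episode_id} ended after {episode.length} steps, "
              f"return={episode.total_reward}")
```

You can plug it in via `config={"callbacks": EpisodeLogger, ...}` in your tune.run() call. Either way, metrics like episodes_this_iter and episode_len_mean show up in the training results, so you can also read episode counts off the Tune output directly.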
@Ofir_Abu there is also a horizon key in the config. If you set it to an integer n, RLlib will artificially end the episode after that many environment steps, store a done=True, and call reset().
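For example, a minimal sketch (the env name and numbers here are just placeholders):

```python
from ray import tune

tune.run(
    "A3C",
    config={
        "env": "CartPole-v1",      # placeholder env
        "num_workers": 2,
        "horizon": 1000,           # force done=True after 1000 env steps per episode
        # "soft_horizon": False,   # default: the env really is reset at the horizon
    },
    stop={"timesteps_total": 10_000_000},
)
```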