How to tell the RLlib Trainer (not Tune) to run a given number of episodes

Hi @Arif_Jahangir and welcome to the forum,

the answer depends on your setup, i.e. whether you are using "batch_mode"="complete_episodes" or "batch_mode"="truncate_episodes" in your config (see here for a list of all configuration parameters).

The number of episodes per batch depends on multiple parameters: with "batch_mode"="complete_episodes", a single batch contains only complete episodes, while with "truncate_episodes" episodes can be truncated, and where they get truncated depends on the configuration parameters "rollout_fragment_length" and "count_steps_by".
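
To make the difference concrete, here is a minimal sketch (old config-dict API; the env name, fragment length, and batch size are placeholders, and the exact import path depends on your RLlib version):

from ray.rllib.agents.ppo import PPOTrainer

config = {
    "env": "YourEnv-v0",                 # placeholder for your registered env
    "batch_mode": "truncate_episodes",   # or "complete_episodes"
    "rollout_fragment_length": 200,      # only relevant for truncate_episodes
    "count_steps_by": "env_steps",       # or "agent_steps" in multi-agent setups
    "train_batch_size": 4000,
}
trainer = PPOTrainer(config=config)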

That said, your episodes might take very many steps through the environment before they end, so the configuration parameters should fit your experiment well. I suggest reading the RLlib documentation on Sample Collection, where you will find good information about how episodes are collected and what to consider when choosing the corresponding configuration parameters.

See also this discussion for understanding environment and agent steps.

Best,
Simon

Thanks @Lars_Simon_Zehnder for your detailed answer.

I have a CSV file that has 1940748 rows, and an episode ends when all of these rows have been read.
I want to run PPO through these 1940748 rows 200 times (two hundred episodes).
So would my config be as follows?

"batch_mode": "truncate_episodes",
"horizon": 1940748*200,
"soft_horizon": True,
"no_done_at_end": True,

Please tell me if the above configuration is correct for running 200 episodes, with each episode having 1940748 steps.

Hi @Arif_Jahangir1,

as written in the section Common Parameters of the Training API, setting no_done_at_end to False and soft_horizon to True results in not resetting the environment at the horizon, but adding done=True at the end of the horizon.

As you want to run 200 episodes, I guess you need to set the horizon to 1940748. If you want the environment to run through the rows again after these 1940748 rows, it probably needs to be reset, so soft_horizon should be set to False; only if there are other variables in the environment that need to be kept between episodes would I set soft_horizon=True. Then no_done_at_end should also be set to False, as this signals to RLlib that the episode is over.
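
If it helps, here is a rough sketch of what that could look like with the plain Trainer (old config-dict API; "CSVRowEnv-v0" is just a placeholder for your CSV-backed environment, and the import path depends on your RLlib version). The Trainer itself has no episode-count stopping criterion, so you would call train() in a loop and check "episodes_total" in the result dict:

from ray.rllib.agents.ppo import PPOTrainer

config = {
    "env": "CSVRowEnv-v0",             # placeholder for your CSV-backed env
    "batch_mode": "truncate_episodes",
    "horizon": 1940748,                # one pass over the CSV = one episode
    "soft_horizon": False,             # reset the environment at the horizon
    "no_done_at_end": False,           # emit done=True so RLlib sees the episode end
}
trainer = PPOTrainer(config=config)

# Stop once roughly 200 episodes have been collected.
while True:
    result = trainer.train()
    if result["episodes_total"] >= 200:
        break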

Hope this helps,
Simon

Arif, did you find the correct approach for your query? If yes, can you share your implementation details and experience?

cc: @arturn @Rohan138, any response?

I agree with @Lars_Simon_Zehnder.