[RLlib] Continuing env, horizon and soft_horizon

albheim · March 18, 2021, 4:45pm

Hi,

I have an environment that is continuing (no episodes) and was wondering what the expected way is to use horizon/soft_horizon/no_done_at_end.

I assumed horizon decided when logging is done (i.e. the number of steps over which the min/mean/max values of the logged metrics are taken) and set it to 1000, since that seemed reasonable for my env. I also set soft_horizon to true since I don’t want my env to reset, and I set no_done_at_end since I never want done to be true.
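For reference, the setup described above corresponds to a config roughly like the following. This is a minimal sketch: the env name is a hypothetical placeholder, and the three keys are the options named in this thread with the values described above.

```python
# Sketch of the config described in this post (plain dict, no Ray import needed).
# "MyContinuingEnv" is a hypothetical placeholder for the actual environment.
config = {
    "env": "MyContinuingEnv",
    # Cut the rollout into chunks of 1000 steps.
    "horizon": 1000,
    # Do not reset the env when the horizon is hit.
    "soft_horizon": True,
    # Do not mark the last step of a chunk as done.
    "no_done_at_end": True,
}

print(config)
```

A dict like this would typically be passed as the trainer config; whether these keys are still accepted depends on the installed Ray version.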

I have been logging data using custom_metrics in a DefaultCallbacks subclass, and noticed that this is not done at the same intervals as the horizon. So now I am curious about what actually triggers the logging, and what horizon does for me in a continuing environment.

albheim · March 18, 2021, 5:06pm

Figured out that train_batch_size seems to set the length of the min/mean/max data collection and thus also the logging interval. Is this documented somewhere? I can’t find it, and it is not obvious to me why this value should set the logging interval.
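To make the relationship concrete, here is a small arithmetic sketch. It assumes (based only on the observation above, not on documented behavior) that one logged result covers about train_batch_size environment steps, while horizon only splits those steps into fixed-length chunks. The helper name and the example value of 4000 are hypothetical.

```python
def chunks_per_logged_result(train_batch_size: int, horizon: int) -> float:
    """Number of horizon-length chunks that fit into one logged result.

    Assumption: each logged result covers train_batch_size env steps.
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    return train_batch_size / horizon


# Example values: horizon from this post, train_batch_size chosen for illustration.
ratio = chunks_per_logged_result(train_batch_size=4000, horizon=1000)
print(ratio)  # 4.0
```

Under that assumption, the min/mean/max values in each logged result would be taken over about four 1000-step chunks rather than one.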
