`horizon` and `no_done_at_end` in combination with `PolicyClient` / `ExternalEnv`

Hello guys!

I want to switch from a continuing to an episodic formulation of the problem I'm working on.
My env/simulator is an external one, so I use the classes `PolicyClient` and `ExternalEnv`. What I intend to do is stop the episode after, let's say, x hours of simulation time, reset the env, and start the next episode.
This is how I've thought of doing it:

  1. I leave the config param `horizon` untouched (i.e. `None`, meaning infinite) but manually check in my simulator whether x hours of simulation time (my "custom horizon") have been reached. If so, the client records the last rewards (`client.log_returns`) and ends the current episode (`client.end_episode`), I reset my env, and finally the client starts the next episode (`client.start_episode`).

  2. I set the config param `no_done_at_end = True`, since I manually force the episode to end instead of actually reaching a terminal state.
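To make the two steps concrete, here is a rough sketch of the client loop I have in mind. `ToySimulator` and `RecordingClient` are made-up placeholders so that the snippet runs on its own; in my real setup the client would be a `PolicyClient`, which as far as I know offers the same four methods (`start_episode`, `get_action`, `log_returns`, `end_episode`). The names `run_episodes`, `horizon_hours` and `reset_env` are my own, and `no_done_at_end` would be set in the trainer config on the server side, so it does not appear here.

```python
class ToySimulator:
    """Hypothetical simulator: each step advances 0.5 h of simulated time."""

    def __init__(self):
        self.hours = 0.0
        self.resets = 0

    def reset(self):
        self.hours = 0.0
        self.resets += 1
        return [self.hours]

    def step(self, action):
        self.hours += 0.5
        return [self.hours], 1.0  # (observation, reward)


class RecordingClient:
    """Placeholder for PolicyClient that only records which calls were made."""

    def __init__(self):
        self.calls = []
        self._count = 0

    def start_episode(self, episode_id=None, training_enabled=True):
        self._count += 1
        episode_id = episode_id or f"ep-{self._count}"
        self.calls.append(("start_episode", episode_id))
        return episode_id

    def get_action(self, episode_id, observation):
        return 0  # dummy action

    def log_returns(self, episode_id, reward, info=None):
        self.calls.append(("log_returns", episode_id))

    def end_episode(self, episode_id, observation):
        self.calls.append(("end_episode", episode_id))


def run_episodes(client, sim, horizon_hours, num_episodes, reset_env=True):
    """Cut episodes after `horizon_hours` of simulated time (custom horizon)."""
    obs = sim.reset()
    for i in range(num_episodes):
        if i > 0 and reset_env:
            obs = sim.reset()  # restart the scenario from its start state
        episode_id = client.start_episode()
        episode_start = sim.hours
        # Manual horizon check on simulated time, not on the step count.
        while sim.hours - episode_start < horizon_hours:
            action = client.get_action(episode_id, obs)
            obs, reward = sim.step(action)
            client.log_returns(episode_id, reward)
        client.end_episode(episode_id, obs)


client = RecordingClient()
sim = ToySimulator()
run_episodes(client, sim, horizon_hours=2.0, num_episodes=3)
```

Calling `run_episodes(..., reset_env=False)` leaves out the reset between episodes, so the simulator simply keeps running across the episode boundary.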

The idea here is to obtain logical episodes and to repeat a simulation scenario from its start state.

What would you say, is this the correct way of doing it?

BTW: Under these circumstances, to mimic the behavior of `soft_horizon = True` (i.e. no reset after hitting the horizon), would it be enough to just skip the reset of my external env/simulator, that is, call `client.end_episode` and then `client.start_episode` without the env reset in between?