How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Problem:
I am using RLlib and Tune together, and my environment (a Gym environment) keeps track of some internal state. For simplicity, say it is the number of episodes, although the real state is more complicated.
Now, if I stop the experiment before it has finished and then resume it, I want every environment that gets created to be restored to the state it had when the experiment was stopped. For example, if one environment was at episode 2 and another at episode 3, they should load self.episodes = 2 and self.episodes = 3 respectively, or something similar.
That way, resuming would give exactly the same results (given that I have set a seed) as letting the experiment run without interruption. But because the environments are currently initialized in their starting state on resume, the results are actually different.
I have looked at the tune.checkpoint function API
https://docs.ray.io/en/latest/tune/api_docs/trainable.html#tune-function-docstring
But that approach requires a function that takes a checkpoint_dir argument, and I don’t know how to get access to that directory from inside the environment.
Before I used Ray, my environment had a set_state function that was called when a checkpoint was saved and a get_state function that was called when a run was resumed. That is basically the functionality I want.
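To make the desired behaviour concrete, here is a minimal sketch in plain Python, without Ray or Gym. CountingEnv, snapshot_state and restore_state are hypothetical names used only for illustration, and the seeded random number stands in for whatever my real environment computes. It shows the property I am after: a run that is stopped and restored gives the same results as an uninterrupted one.

```python
import random


class CountingEnv:
    """Toy stand-in for a Gym environment that tracks internal state (hypothetical)."""

    def __init__(self, seed=0):
        self.episodes = 0
        self.rng = random.Random(seed)

    def reset(self):
        # Each reset starts a new episode and draws a seeded value.
        self.episodes += 1
        return self.rng.random()

    def snapshot_state(self):
        # Everything needed to continue exactly where the run stopped.
        return {"episodes": self.episodes, "rng": self.rng.getstate()}

    def restore_state(self, state):
        self.episodes = state["episodes"]
        self.rng.setstate(state["rng"])


# Uninterrupted run: 5 episodes.
env = CountingEnv(seed=42)
uninterrupted = [env.reset() for _ in range(5)]

# Interrupted run: stop after 2 episodes and take a snapshot.
first = CountingEnv(seed=42)
before = [first.reset() for _ in range(2)]
saved = first.snapshot_state()

# Resume: a fresh environment starts in its initial state, then is restored.
resumed_env = CountingEnv(seed=0)
resumed_env.restore_state(saved)
after = [resumed_env.reset() for _ in range(3)]
resumed = before + after
```

Without the restore_state call, the resumed environment would start again from episode 0 with a fresh random stream, which is the mismatch I currently see.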
Let me know if this is possible.
Edit: To clarify a bit more, consider the scenario where I have two runs, run_1 and run_2. With Tune, the checkpoints get stored in the format name_of_grid_search_experiment/run_1 and name_of_grid_search_experiment/run_2. Now suppose that when I stop the experiment, run_1 is at episode 45 and run_2 is at episode 46. I want to save those numbers so that, when I resume, the environment of run_1 knows it is at episode 45 and the environment of run_2 knows it is at episode 46. How can I do this?
I cannot simply save the state to an arbitrary location of my own, because it needs to be stored inside the checkpoint that Tune already generates (the checkpoint_00001 folder and so on). But the environment doesn’t know the directory and file name of the checkpoint currently being saved, and when loading (resuming) a model it also doesn’t know which folder to look in.
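As a rough sketch of what I would like to be able to write once the checkpoint directory is known: save_env_state, load_env_state and the env_state.json file name below are hypothetical helpers of my own, not part of Ray, and the temporary directory only mimics the folder layout described above. How the environment gets hold of checkpoint_dir is exactly the part I am missing.

```python
import json
import os
import tempfile


def save_env_state(checkpoint_dir, state, filename="env_state.json"):
    """Write the environment state next to the other checkpoint files."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    path = os.path.join(checkpoint_dir, filename)
    with open(path, "w") as f:
        json.dump(state, f)
    return path


def load_env_state(checkpoint_dir, filename="env_state.json"):
    """Read the environment state back from a checkpoint folder."""
    with open(os.path.join(checkpoint_dir, filename)) as f:
        return json.load(f)


def checkpoint_path(root, run):
    # Mimics name_of_grid_search_experiment/<run>/checkpoint_00001.
    return os.path.join(root, "name_of_grid_search_experiment", run, "checkpoint_00001")


# Stand-in for the experiment directory that Tune would manage.
root = tempfile.mkdtemp()

# On stop: each run stores its own episode count in its own checkpoint folder.
episodes_at_stop = {"run_1": 45, "run_2": 46}
for run, episodes in episodes_at_stop.items():
    save_env_state(checkpoint_path(root, run), {"episodes": episodes})

# On resume: each run reads back only its own state.
restored = {
    run: load_env_state(checkpoint_path(root, run))["episodes"]
    for run in episodes_at_stop
}
```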