Tune as part of curriculum training

I think one way to do this is with callbacks.
Notice that in the example Artur gave you, what we essentially do is load a pre-trained policy checkpoint into the algorithm and use it as the baseline for further training.

You can actually do that with a callback like on_algorithm_start():

You have access to the algorithm there, so you just need to create a policy from the checkpoint you want to resume from and call algorithm.add_policy(policy=<your reloaded policy>).
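Roughly, that could look like the sketch below. This is just a minimal sketch, assuming Ray 2.x RLlib, where the hook on DefaultCallbacks is named on_algorithm_init() (the exact name may differ between Ray versions); the checkpoint path and policy ID are placeholders you'd swap for your own:

```python
from ray.rllib.algorithms.callbacks import DefaultCallbacks
from ray.rllib.policy.policy import Policy


class LoadPretrainedPolicy(DefaultCallbacks):
    # Runs once, right after the Algorithm has been set up.
    def on_algorithm_init(self, *, algorithm, **kwargs):
        # Rebuild a policy from an existing policy checkpoint
        # (this path is a placeholder -- point it at your own checkpoint).
        restored = Policy.from_checkpoint(
            "/path/to/old/checkpoint/policies/default_policy"
        )
        # Register the reloaded policy with the running algorithm so that
        # further training continues from the pre-trained weights.
        algorithm.add_policy(
            policy_id="pretrained_policy",
            policy=restored,
        )
```

You'd then register the callback class on your config (e.g. `config.callbacks(LoadPretrainedPolicy)`) before building the algorithm.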

Give this a try. Your project sounds really exciting actually :slight_smile: