Hi @davidADSP
Regarding multi-agent, I suggest you take a look at this issue:
It should let you get something out of get_policy() by specifying which policy you want (via its policy ID).
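As a rough sketch of the idea (not tested against your setup; the example env import path varies between Ray versions, and the policy IDs "policy_0"/"policy_1" are just illustrative):

```python
from ray.rllib.algorithms.ppo import PPOConfig
# NOTE: the import path of this example env differs between Ray versions.
from ray.rllib.examples.env.multi_agent import MultiAgentCartPole

config = (
    PPOConfig()
    .environment(MultiAgentCartPole, env_config={"num_agents": 2})
    .multi_agent(
        # "policy_0"/"policy_1" are made-up IDs for this sketch.
        policies={"policy_0", "policy_1"},
        policy_mapping_fn=lambda agent_id, *args, **kwargs: f"policy_{agent_id}",
    )
)
algo = config.build()
algo.train()  # one iteration, just so there is something trained

# get_policy() takes the policy ID you want to inspect/export.
pol = algo.get_policy("policy_0")
weights = pol.get_weights()
```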
Well, I don’t think what you would love is possible (tuner = ray.tune.Tuner(agent, …)). But there are ways around it: change the config and then do a new Tuner.fit() with the new config, starting from a previously saved checkpoint from the first tuner, without losing anything.
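Very roughly, the pattern looks like this (a minimal sketch assuming Ray ~2.x and the old API stack; the env, stop criteria and metric name are placeholders, and for brevity the second phase here uses the Algorithm API directly instead of a second Tuner.fit(), which is what my toy example does):

```python
from ray import air, tune
from ray.rllib.algorithms.algorithm import Algorithm
from ray.rllib.algorithms.ppo import PPOConfig

# Phase 1: initial tuning run.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .rollouts(num_rollout_workers=0)  # keeps weight handling local/simple
    .training(lr=5e-5)
)
tuner = tune.Tuner(
    "PPO",
    param_space=config.to_dict(),
    run_config=air.RunConfig(stop={"training_iteration": 10}),
)
results = tuner.fit()
best_ckpt = results.get_best_result(
    metric="episode_reward_mean", mode="max"  # metric name depends on Ray version
).checkpoint

# Phase 2: pull the trained weights out of the best checkpoint and continue
# training with a *changed* config (lr=0.0 here, see the note further down).
weights = Algorithm.from_checkpoint(best_ckpt).get_policy().get_weights()

new_config = (
    PPOConfig()
    .environment("CartPole-v1")
    .rollouts(num_rollout_workers=0)
    .training(lr=0.0)  # changed config
)
algo = new_config.build()
algo.get_policy().set_weights(weights)  # carry the trained weights over
for _ in range(5):
    algo.train()
```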
I’m not proficient in Torch, but I think Torch models have similar functions for saving and loading weights as TensorFlow. This topic is sparse and fragmented in the official Ray documentation in general. It’s a pity, because it deters potential new users from getting started with this great RL framework.

I’ve made a toy example on my GitHub that does what you want (tuning twice with different configs) and takes it all the way to production without the overhead of Ray at the end, relying only on the ML framework (TensorFlow). At the end of the day, that is probably what most users need.
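The production part is essentially just exporting the trained TF policy as a SavedModel and loading it back with plain TensorFlow. A hedged sketch, continuing from the `algo` in the sketch above ("exported_policy" is a placeholder path, and the exact serving signature depends on your RLlib/TF versions):

```python
import tensorflow as tf

# Export the trained (TF) policy of the Algorithm as a SavedModel.
algo.export_policy_model("exported_policy")  # exports "default_policy" by default

# From here on, no Ray is needed: plain TensorFlow only.
loaded = tf.saved_model.load("exported_policy")
print(list(loaded.signatures.keys()))  # inspect the available serving signatures
```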
Note that during the second Tuner.fit(), “lr” is changed as well as “training_iteration”. Setting lr=0.0 is there to prove that the weights loaded initially are the previously trained ones saved in the best checkpoint: they do not change when lr=0.0.
Take a look here:
In terms of your last point, I’m not sure, but my feeling is no, sorry.
BR
Jorgen