Thanks for your great work on RLlib; we all really enjoy it! However, we have run into a problem implementing a particular feature, and we would appreciate your advice on the recommended way to do this in RLlib.
Problem Description:
We need to add one more layer on top of an already trained policy (a transfer-learning setup). The trained policy should stay frozen during training, and only the new layer should be trained.
Our Naive Solution:
Our plan is to use your custom training workflow related features. In the custom training workflow, we would select part of the model's parameters to freeze and train only the remaining parameters (a rough sketch of the idea is below).
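To make the idea concrete, here is a minimal sketch of the kind of custom model we have in mind, assuming the PyTorch model API (`TorchModelV2`). The layer sizes, the checkpoint path `pretrained_backbone.pt`, and the model name `frozen_backbone` are placeholders for illustration only, not part of our actual code:

```python
import torch
import torch.nn as nn

from ray.rllib.models import ModelCatalog
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


class FrozenBackboneModel(TorchModelV2, nn.Module):
    """Pretrained backbone (frozen) plus a small trainable head on top."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs, model_config, name)
        nn.Module.__init__(self)

        # Backbone with the same architecture as the previously trained policy.
        self.backbone = nn.Sequential(
            nn.Linear(obs_space.shape[0], 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
        )
        # Hypothetical path to the weights exported from the trained policy.
        self.backbone.load_state_dict(torch.load("pretrained_backbone.pt"))
        # Freeze the pretrained part so the optimizer never updates it.
        for p in self.backbone.parameters():
            p.requires_grad = False

        # New, trainable transfer-learning layers.
        self.head = nn.Linear(256, num_outputs)
        self.value_head = nn.Linear(256, 1)
        self._features = None

    def forward(self, input_dict, state, seq_lens):
        self._features = self.backbone(input_dict["obs"].float())
        return self.head(self._features), state

    def value_function(self):
        return self.value_head(self._features).squeeze(-1)


# Register the custom model so it can be referenced from the Trainer config,
# e.g. config["model"] = {"custom_model": "frozen_backbone"}.
ModelCatalog.register_custom_model("frozen_backbone", FrozenBackboneModel)
```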
Thanks for your help. Please let us know whether our solution is correct and follows the RLlib philosophy, or whether there is already a much easier way to do this.
Glad we could help.
If you want to load the complete model of a previously trained policy, the easiest way is to call the restore() method of your Trainer. From the docs: