How to use my pretrained model as policy and value network

Hi @cubpaw,

We have recently rolled out a new module and learning stack inside RLlib (RLModules) that gives you a lot of flexibility for achieving exactly what you described. Here is the example code.

We recommend using the nightly releases if you want to try the RLModule features.

Having said that, it is still possible to do this in the current stable release; it just needs a bit more work. It involves two steps:

  1. You need to define a custom model that initializes the actor and value networks with the architecture they were pretrained on. At this stage, make sure you can train an algorithm with this custom model via RLlib (a minimal sketch follows this list). Example:
    ray/rllib/examples/custom_rnn_model.py at master · ray-project/ray · GitHub
  2. You need to use the callbacks' `on_algorithm_init` hook to load the pretrained weights into your custom model during initialization (see the second sketch after this list).
    Related discourse q: Updating policy_mapping_fn while using tune.run() and restoring from a checkpoint - #3 by Muff2n
    Related example: ray/rllib/examples/restore_1_of_n_agents_from_checkpoint.py at master · ray-project/ray · GitHub
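To make step 1 more concrete, here is a minimal sketch for the stable (ModelV2) stack. The layer sizes, the `"pretrained_actor_critic"` model name, and the overall architecture are assumptions; replace them with whatever your networks were actually pretrained with so the weights load cleanly.

```python
import numpy as np
import torch
import torch.nn as nn

from ray.rllib.models import ModelCatalog
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


class PretrainedActorCritic(TorchModelV2, nn.Module):
    """Custom model whose policy and value nets mirror the pretrained architecture."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs, model_config, name)
        nn.Module.__init__(self)
        obs_dim = int(np.prod(obs_space.shape))
        # Hypothetical architecture -- it must match the one the weights were pretrained on.
        self.policy_net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, num_outputs)
        )
        self.value_net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, 1)
        )
        self._value_out = None

    def forward(self, input_dict, state, seq_lens):
        obs = input_dict["obs_flat"].float()
        self._value_out = self.value_net(obs).squeeze(-1)
        return self.policy_net(obs), state

    def value_function(self):
        return self._value_out


# Register the model so it can be referenced by name in the algorithm config.
ModelCatalog.register_custom_model("pretrained_actor_critic", PretrainedActorCritic)
```

And a sketch of step 2: loading the pretrained weights in `on_algorithm_init` and wiring everything into a PPO config. The checkpoint path `pretrained_actor_critic.pt` and the `"policy"` / `"value"` keys inside it are hypothetical placeholders for however you saved your pretrained networks.

```python
import torch

from ray.rllib.algorithms.callbacks import DefaultCallbacks
from ray.rllib.algorithms.ppo import PPOConfig


class LoadPretrainedWeights(DefaultCallbacks):
    """Copies the pretrained weights into the custom model once the algorithm is built."""

    def on_algorithm_init(self, *, algorithm, **kwargs):
        # Hypothetical checkpoint holding one state dict per sub-network.
        state = torch.load("pretrained_actor_critic.pt")
        model = algorithm.get_policy().model
        model.policy_net.load_state_dict(state["policy"])
        model.value_net.load_state_dict(state["value"])
        # Broadcast the updated weights from the local worker to all rollout workers.
        algorithm.workers.sync_weights()


config = (
    PPOConfig()
    .environment("CartPole-v1")
    .framework("torch")
    .training(model={"custom_model": "pretrained_actor_critic"})
    .callbacks(LoadPretrainedWeights)
)
algo = config.build()
```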

cc @avnishn
