Hi Ray community,
After training a PPO agent, I would like to export the model / weights / policy so that I can apply the learned policy to another environment instance.
What is the best way to do this? Which aspects need to be considered when using a custom TFModelV2?
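The pattern I have in mind is a weights round trip: pull the weights from the trained policy, serialize them, and load them into a fresh policy built around the other environment. Here is a minimal stand-in sketch of that pattern — `DummyPolicy` is just a placeholder for the object RLlib's `trainer.get_policy()` would return, not the real API:

```python
import pickle

# Stand-in for an RLlib policy: real code would use
# trainer.get_policy() after training (placeholder class, not RLlib API).
class DummyPolicy:
    def __init__(self):
        self.weights = {"fc1/kernel": [0.1, 0.2], "fc1/bias": [0.0]}

    def get_weights(self):
        return self.weights

    def set_weights(self, weights):
        self.weights = weights

# "Export": pull weights from the trained policy and serialize them.
trained = DummyPolicy()
blob = pickle.dumps(trained.get_weights())

# "Import": build a fresh policy (e.g. around another env instance)
# and load the saved weights into it.
fresh = DummyPolicy()
fresh.set_weights(pickle.loads(blob))

assert fresh.get_weights() == trained.get_weights()
```

Is this roughly the intended workflow, or is a checkpoint / SavedModel export the preferred route?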
I am asking because I have experienced in the past that there are many “hidden” relations and dependencies when working with RLlib. For example, I have already come across an open RLlib issue which makes me assume that Keras may also play a role here.
My question is not related to this thread, since in that case the export / saving works fine.
Thanks in advance for your advice!