Fine-tuning an MBMPO policy

Thank you @arturn for your quick response, it’s helpful.

I understand that a MAML policy needs to be fine-tuned, and that this can be done directly with RLlib's PPO algorithm (this thread mentions it, and it has also been tested: MAML finetune adaptation step for inference).

Is there a better approach in RLlib for fine-tuning an MBMPO policy?

Regards