[rllib] MAML with an algorithm other than PPO?

Hello,

We’d like to use SAC with MAML so that we can incorporate some offline training. Is this fairly straightforward in RLlib? If not, can anyone suggest a minimal course of action to implement this?

Hi @lucas_spangher,

How are you imagining this training would go?

Like this?

  1. Pretrain with CQL with offline data
  2. Train with MAML

Or like this, in a loop?

  1. Train with CQL for x iterations
  2. Train with MAML for y iterations
  3. Repeat from step 1
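
To make the two options concrete, here is a rough sketch of both schedules. It assumes only that each trainer object exposes a `train()` method that runs one iteration, as RLlib trainers do. `StubTrainer`, `pretrain_then_meta_train`, and `alternate` are hypothetical names made up for illustration, not RLlib APIs. The sketch does not address how (or whether) weights could be shared between a CQL policy and a MAML policy, which would depend on the two policies having compatible networks.

```python
class StubTrainer:
    """Hypothetical stand-in for an RLlib trainer; it only mimics train()."""

    def __init__(self, name):
        self.name = name
        self.iterations = 0

    def train(self):
        # A real trainer would run one training iteration and return metrics.
        self.iterations += 1
        return {"trainer": self.name, "training_iteration": self.iterations}


def pretrain_then_meta_train(cql, maml, x, y):
    """Option 1: x CQL iterations on offline data, then y MAML iterations."""
    log = [cql.train()["trainer"] for _ in range(x)]
    log += [maml.train()["trainer"] for _ in range(y)]
    return log


def alternate(cql, maml, x, y, rounds):
    """Option 2: repeat (x CQL iterations, y MAML iterations) `rounds` times."""
    log = []
    for _ in range(rounds):
        log += pretrain_then_meta_train(cql, maml, x, y)
    return log


if __name__ == "__main__":
    # Show the order in which the two trainers would be stepped.
    print(alternate(StubTrainer("CQL"), StubTrainer("MAML"), x=2, y=1, rounds=2))
```

Swapping the stubs for real trainers would keep the control flow the same, but the weight hand-off between the two is the part that would need actual design work.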

You could also save offline data from MAML to use with CQL, but there would be some bookkeeping to make sure the environments matched.
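
As a rough illustration of that data hand-off, the config fragments below use RLlib's generic `"output"` and `"input"` settings for writing and reading experience files. The directory, the environment name `MyTaskEnv`, and the helper `envs_match` are all hypothetical, and I have not verified that MAML's sampling path writes output the way other algorithms do, so treat this as a sketch under those assumptions.

```python
DATA_DIR = "/tmp/maml-rollouts"  # hypothetical location for saved experiences

# Config fragment for the MAML run: write sampled experiences to disk.
maml_config = {
    "env": "MyTaskEnv",  # hypothetical environment name
    "output": DATA_DIR,
}

# Config fragment for the CQL run: read those experiences back as offline data.
cql_config = {
    "env": "MyTaskEnv",  # must describe the same environment the data came from
    "input": DATA_DIR,
}


def envs_match(writer_config, reader_config):
    """Minimal bookkeeping check: same env name, and reader points at writer's data."""
    return (
        writer_config["env"] == reader_config["env"]
        and writer_config["output"] == reader_config["input"]
    )


if __name__ == "__main__":
    print(envs_match(maml_config, cql_config))
```

A name check like this only catches the obvious mismatch. Since MAML samples across tasks, you would probably also want to record which task each batch came from.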

Also, the RLlib CQL implementation is very new, so there may be some lingering issues that have not been found yet.