Controlling inner loop of MAML

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hey everyone,

I am wondering how to control the inner loop gradient update in MAML. In PPO, the parameters train_batch_size, sgd_minibatch_size, and num_sgd_iter control the batch size and the number of SGD iterations during training.
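For reference, here is how I understand those three PPO parameters to interact (a plain-Python illustration, not RLlib code; the parameter values are just example defaults):

```python
# Hedged sketch: how PPO's three parameters determine the number of
# gradient updates per training iteration, assuming the usual scheme of
# num_sgd_iter epochs over the train batch in minibatches.
train_batch_size = 4000    # samples collected before each train step
sgd_minibatch_size = 128   # size of each SGD minibatch
num_sgd_iter = 30          # epochs over the whole train batch

# Minibatches per epoch, rounded up (ceil division via negation trick).
minibatches_per_epoch = -(-train_batch_size // sgd_minibatch_size)

# Total gradient updates per training iteration.
updates_per_iter = num_sgd_iter * minibatches_per_epoch
print(updates_per_iter)  # 30 * ceil(4000 / 128) = 30 * 32 = 960
```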

In MAML, however, we have the parameters inner_adaptation_steps and train_batch_size. How do they control the inner loop gradient update? My understanding is that if inner_adaptation_steps = 1, we collect all the samples of one episode and then perform a gradient update. But how exactly does the update process work? Do we perform several iterations of updating with minibatches, as in PPO? Or is it just one gradient update with all samples of the episode? And what role does train_batch_size play then?

Thanks for your help!


Hi @ElektroChan89 ,

  • I did not know this either, but num_sgd_iter does not affect MAML, because MAML uses RolloutWorker.learn_on_batch().
  • The update process works via workers.sync_weights(). So before and after each inner loop, the weights are synced, as far as I can see.
  • train_batch_size does not play a role here, and neither does sgd_minibatch_size.
  • inner_adaptation_steps controls how many batches we optimize on before returning to the outer loop. Note that the logic for this in our code is an execution_plan - an API that we have been working to deprecate. That means that itr in the inner loop is an iterator that the inner loop pulls batches from until it has stepped "enough".
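To make the points above concrete, here is a rough control-flow sketch of the inner loop. The worker class is a stand-in stub, not the RLlib API; only the overall shape (one learn_on_batch() call per collected batch, no minibatch loop) reflects the actual behavior:

```python
# Stub worker to illustrate MAML's inner loop; not RLlib's RolloutWorker.
class StubWorker:
    def __init__(self):
        self.updates = 0       # counts single-batch gradient updates

    def sample(self):
        return "batch"         # stands in for one sampled episode batch

    def learn_on_batch(self, batch):
        self.updates += 1      # one update on the whole batch, no minibatching


def maml_inner_loop(worker, inner_adaptation_steps):
    # Inner loop: pull batches and update once per batch. There is no
    # num_sgd_iter-style epoch/minibatch loop as in PPO.
    for _ in range(inner_adaptation_steps):
        batch = worker.sample()
        worker.learn_on_batch(batch)
    # After the inner steps, control returns to the outer (meta) update,
    # and weights are synced back to the workers (workers.sync_weights()
    # in RLlib).


worker = StubWorker()
maml_inner_loop(worker, inner_adaptation_steps=1)
print(worker.updates)  # 1 gradient update: one per inner adaptation step
```

So with inner_adaptation_steps = 1, the inner loop performs exactly one full-batch gradient update before the meta-update, which matches the bullet points above.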