Controlling inner loop of MAML

ElektroChan89 · December 16, 2022, 2:35pm

How severe does this issue affect your experience of using Ray?

Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hey everyone,

I am wondering how to control the inner loop gradient update in MAML. In PPO, we have the parameters train_batch_size, sgd_minibatch_size, and num_sgd_iter to control the batch size and the number of SGD iterations for the training process.

In MAML, however, we have the parameters inner_adaptation_steps and train_batch_size. How do they control the inner loop gradient update? My understanding is that if inner_adaptation_steps = 1, we collect all the samples of 1 episode and then perform a gradient update. But how exactly does the update process work? Do we perform several iterations of updating with minibatches, like in PPO? Or is it just 1 gradient update with all samples of 1 episode? And what is the role of train_batch_size then?

Thanks for your help!

Best,
Arthur

arturn · December 19, 2022, 11:07am

Hi @ElektroChan89 ,

I did not know this either, but num_sgd_iter does not affect MAML, because it uses RolloutWorker.learn_on_batch().
The update process works with workers.sync_weights(). So, as far as I can see, weights are synced before and after each inner loop.
train_batch_size does not play a role here, and neither does sgd_minibatch_size.
inner_adaptation_steps controls how many batches to optimize on before we return to the outer loop. Note that the logic for this in our code is an execution_plan, an API that we have been working to deprecate. That means that itr in the inner loop is an iterator that the inner loop pulls batches from until it has stepped “enough”.
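
To make the shape of that loop concrete, here is a minimal plain-Python sketch. It is not RLlib's code: ToyWorker, batch_iterator, and inner_loop are hypothetical names, the "policy" is a single float, and the sketch assumes that each learn_on_batch() call performs exactly one gradient step on the whole batch it receives. It only illustrates that the number of updates follows inner_adaptation_steps, with no minibatching and no repeated SGD passes.

```python
class ToyWorker:
    """Hypothetical stand-in for a rollout worker; NOT an RLlib class."""

    def __init__(self, weight=0.0, lr=0.1):
        self.weight = weight  # toy "policy": a single scalar parameter
        self.lr = lr
        self.updates = 0      # counts gradient steps taken so far

    def sample(self):
        # Pretend rollout: a fixed batch of regression targets.
        return [1.0, 1.0, 1.0]

    def learn_on_batch(self, batch):
        # Assumption: one gradient step on the full batch,
        # here for the loss mean((weight - y)^2).
        grad = sum(2.0 * (self.weight - y) for y in batch) / len(batch)
        self.weight -= self.lr * grad
        self.updates += 1


def batch_iterator(worker):
    # Plays the role of `itr`: an endless stream of freshly sampled batches.
    while True:
        yield worker.sample()


def inner_loop(worker, meta_weight, inner_adaptation_steps):
    # Sync the worker to the meta-policy weights before adapting.
    worker.weight = meta_weight
    itr = batch_iterator(worker)
    # Pull exactly one batch per adaptation step, one update per batch.
    for _ in range(inner_adaptation_steps):
        worker.learn_on_batch(next(itr))
    return worker.weight


worker = ToyWorker()
adapted = inner_loop(worker, meta_weight=0.0, inner_adaptation_steps=2)
print(worker.updates, adapted)
```

In this toy setup, raising inner_adaptation_steps is the only way to get more inner-loop updates; nothing in the loop reads a minibatch size or an SGD iteration count.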
