Behaviour of trainer.set_weights in different callbacks

george_sk · June 21, 2021, 11:06am

Hi all!

I assume that if I want to save the policy weights obtained with get_policy().get_weights(), the on_train_result callback is the place to do it.

But if I want to load previously saved weights with trainer.set_weights and I call it in the on_train_result callback, are these weights used in the next training iteration (given that the training results for the current iteration have already been produced)?

What is the difference if I call trainer.set_weights in the on_postprocess_trajectory callback instead?

In this GitHub issue, https://github.com/ray-project/ray/issues/6669, the on_train_result callback is mentioned as the appropriate place for setting weights, but it is still not clear to me why.

Thanks in advance.
George

mannyv · June 21, 2021, 6:25pm

Hi @george_sk ,

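Using on_train_result would get/update the weights after the training step of the current iteration and before the next rollout (experience collection) step, so weights set there are already in place when the next iteration starts collecting experience.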

Using on_postprocess_trajectory would do this each time a trajectory fragment is postprocessed, that is, after every rollout_fragment_length steps of an episode, or at the end of the episode if it finishes sooner. So it would happen multiple times within one iteration, before the training step updates the weights.
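
To make the ordering concrete, here is a minimal toy sketch in plain Python. It does not use RLlib at all: ToyTrainer, its train loop, the single scalar weight "w", and SAVED_WEIGHTS are all hypothetical stand-ins, and it ignores remote rollout workers entirely. It only mimics the order described above (several rollout fragments, then one training step, then on_train_result).

```python
from typing import Callable, Dict, List, Tuple

Weights = Dict[str, float]
Event = Tuple[str, float]


class ToyTrainer:
    """Hypothetical stand-in for a trainer. This is NOT the RLlib Trainer."""

    def __init__(self, fragments_per_iteration: int = 3) -> None:
        self.fragments_per_iteration = fragments_per_iteration
        self.weights: Weights = {"w": 0.0}
        self.events: List[Event] = []

    def set_weights(self, weights: Weights) -> None:
        # Copy so the caller's saved weights are never mutated by training.
        self.weights = dict(weights)

    def train(
        self,
        on_postprocess_trajectory: Callable[["ToyTrainer"], None],
        on_train_result: Callable[["ToyTrainer"], None],
    ) -> None:
        # 1) Rollout phase: one callback call per collected fragment.
        for _ in range(self.fragments_per_iteration):
            self.events.append(("rollout", self.weights["w"]))
            on_postprocess_trajectory(self)
        # 2) Training step: pretend the optimizer moves the weight by 1.0.
        self.weights["w"] += 1.0
        self.events.append(("train", self.weights["w"]))
        # 3) Called once per iteration, after the training step.
        on_train_result(self)


# Hypothetical previously saved weights.
SAVED_WEIGHTS: Weights = {"w": 10.0}
postprocess_calls: List[float] = []


def on_postprocess_trajectory(trainer: ToyTrainer) -> None:
    # Fires once per fragment, i.e. several times per iteration.
    postprocess_calls.append(trainer.weights["w"])


def on_train_result(trainer: ToyTrainer) -> None:
    # Overwrite the freshly trained weights before the next rollout phase.
    trainer.set_weights(SAVED_WEIGHTS)


trainer = ToyTrainer(fragments_per_iteration=3)
trainer.train(on_postprocess_trajectory, on_train_result)
trainer.train(on_postprocess_trajectory, on_train_result)

for event in trainer.events:
    print(event)
```

In this toy model the rollouts of the second iteration run with the weights set in on_train_result, while on_postprocess_trajectory fires three times per iteration. Whether a real RLlib setup with remote workers behaves the same way depends on how and when weights are synced to those workers, which this sketch does not model.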

george_sk · June 22, 2021, 1:56pm

Thanks @mannyv for your reply.
I see, so the ideal callback for setting weights is on_train_result.
