Behaviour of trainer.set_weights in different callbacks

Hi all!

I assume that on_train_result is the right callback if I want to save the policy weights obtained with get_policy().get_weights().

But if I want to load previously saved weights with trainer.set_weights and I call it in the on_train_result callback, are these weights used in the next training iteration (since the training results for the current iteration have already been produced)?

What is the difference if I call trainer.set_weights in the on_postprocess_trajectory callback instead?

In this GitHub issue, https://github.com/ray-project/ray/issues/6669, the on_train_result callback is referred to as the appropriate one for setting weights, but it is still not clear to me why.

Thanks in advance.
George

Hi @george_sk ,

Using on_train_result would get/update the weights after the training step and before the next rollout (experience collection step).
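
To make that concrete, here is a minimal sketch of saving weights once and then restoring them from on_train_result. It is written against a stand-in trainer so it runs without Ray: FakeTrainer, WeightSwapCallbacks and the in-memory saved_weights store are all hypothetical names of mine, and in RLlib the callback class would subclass DefaultCallbacks instead of being a plain class. Only get_weights/set_weights and the on_train_result hook come from the discussion above.

```python
import copy


class WeightSwapCallbacks:
    """Stand-in for an RLlib callbacks class (hypothetical, no Ray needed)."""

    def __init__(self):
        # Hypothetical in-memory store; a real setup might write to disk.
        self.saved_weights = None

    def on_train_result(self, *, trainer, result, **kwargs):
        # Called after a training step has finished.
        if self.saved_weights is None:
            # First call: take a snapshot of the current weights.
            self.saved_weights = copy.deepcopy(trainer.get_weights())
        else:
            # Later calls: overwrite the freshly trained weights with the
            # snapshot, so the next rollout starts from the saved weights.
            trainer.set_weights(self.saved_weights)


class FakeTrainer:
    """Hypothetical toy trainer: each train() call adds 1.0 to one weight."""

    def __init__(self):
        self.weights = {"default_policy": {"w": 0.0}}
        self.iteration = 0

    def get_weights(self):
        return self.weights

    def set_weights(self, weights):
        self.weights = copy.deepcopy(weights)

    def train(self):
        self.iteration += 1
        self.weights["default_policy"]["w"] += 1.0
        return {"training_iteration": self.iteration}


trainer = FakeTrainer()
callbacks = WeightSwapCallbacks()
for _ in range(3):
    result = trainer.train()
    callbacks.on_train_result(trainer=trainer, result=result)
```

In this toy run the snapshot is taken after the first training step, and every later step ends with the weights reset to that snapshot before the next iteration begins.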

Using on_postprocess_trajectory would do this after every rollout_fragment_length steps of an episode, or after the full episode if it is shorter. So this would happen multiple times in an iteration before the weights were updated.
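
The ordering of the two callbacks can be sketched with a toy event log. This is not RLlib code, just a simplified model of the loop described above, assuming a single agent and a fixed number of rollout fragments per iteration; run_iterations and the event names are made up for illustration.

```python
def run_iterations(num_iterations, fragments_per_iteration):
    """Return the order of events in a simplified training loop (toy model)."""
    events = []
    for iteration in range(num_iterations):
        # Experience collection: one postprocess callback per fragment.
        for _ in range(fragments_per_iteration):
            events.append((iteration, "rollout_fragment"))
            events.append((iteration, "on_postprocess_trajectory"))
        # The weights are only updated here, once per iteration.
        events.append((iteration, "training_step"))
        # Fires after the update and before the next iteration's rollouts.
        events.append((iteration, "on_train_result"))
    return events


events = run_iterations(num_iterations=2, fragments_per_iteration=3)
for iteration, name in events:
    print(iteration, name)
```

In this model on_postprocess_trajectory appears several times inside each iteration, while on_train_result appears exactly once, right at the boundary between one training step and the next round of rollouts.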

Thanks @mannyv for your reply.
I see, so the ideal callback for setting weights is on_train_result.