If all model parameters are being stored and loaded correctly, it is unclear what is causing this behavior.
I cannot share my code in its entirety due to confidentiality issues.
However, I can share some of the results of the training to aid in the discussion.
As you can see, I restarted the second run just before the 3M-step mark, and there is a large discontinuity in both actor_loss and td_error.
On the other hand, the reward drops slightly but immediately returns to the saturation value of the first run.
This seems like strange behavior; can it be reasonably interpreted?
Hi @Halman, I would recommend going to the code snippet kourosh mentioned above and printing the model’s state_dict in both get_weights and set_weights, as a starter. Check if the VAE layers are present in the state_dict printed.
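For reference, a minimal sketch of that kind of check. It assumes the VAE parameters are registered under module names containing the substring "vae"; adjust the substring to whatever prefix your model actually uses:

```python
def find_vae_keys(state_dict):
    """Return the state_dict keys that belong to the VAE layers.

    Assumes VAE parameters are registered under names containing
    "vae"; change the substring to match your model definition.
    """
    return [k for k in state_dict if "vae" in k.lower()]


# A plain dict standing in for a torch state_dict:
weights = {"vae.encoder.weight": None, "policy.fc.weight": None}
print(find_vae_keys(weights))  # -> ['vae.encoder.weight']
```

Printing the result of such a check inside both `get_weights` and `set_weights` should tell you whether the VAE layers survive the checkpoint round-trip.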
Thank you for your comment.
I found that the set_weights function in torch_policy_v2.py was not called when I resumed. Does this indicate that the resume is not working? Or is it possible that some other .py file is handling the resume? (I am currently using Ray 2.3.)
Right now I am resuming training with resume=True as shown in the picture below; is there anything I am doing incorrectly?
Ah, resume will restart the trial, but not restore the weights. To do the latter, you need to pass in restore=path_to_your_checkpoint, I believe. Also, if you upgrade to Ray 2.5.1, you can instead use the ray.tune.Tuner class, which is better documented and maintained; tune.run will be deprecated shortly.
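For illustration, a sketch of the difference between the two approaches. The algorithm name and paths below are placeholders, not values from this thread:

```python
from ray import tune

# resume=True restarts the trial from experiment state but does NOT
# reload model weights; restore= points at a concrete checkpoint and
# does reload them.
tune.run(
    "SAC",                                # placeholder trainable name
    restore="/path/to/your/checkpoint",   # placeholder checkpoint path
)

# Ray 2.5+ alternative: the Tuner API, where Tuner.restore picks up
# an interrupted experiment from its experiment directory.
from ray.tune import Tuner

tuner = Tuner.restore(path="/path/to/experiment_dir", trainable="SAC")
tuner.fit()
```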