Will the added model be saved and loaded?

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Reinforcement learning is performed using the SAC algorithm in RLlib 2.3.

Since image observations are used as input, the model combines a CNN with an MLP.

In my experiments, I am adding a VAE to the training: the CNN serves as the encoder, and I add a decoder on top of it.
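
To make the setup concrete, here is a simplified sketch of the kind of model I mean (illustrative only, not my actual code; the layer sizes assume 84x84x3 image observations). The key point is that the decoder is a regular submodule of the custom TorchModelV2, so its parameters should appear in state_dict() along with everything else:

    import torch
    import torch.nn as nn
    from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


    class EncoderDecoderModel(TorchModelV2, nn.Module):
        """Illustrative sketch: CNN encoder shared with the policy, plus a VAE decoder."""

        def __init__(self, obs_space, action_space, num_outputs, model_config, name):
            TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                                  model_config, name)
            nn.Module.__init__(self)
            # CNN encoder, shared between the policy head and the VAE.
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=8, stride=4), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
                nn.Flatten(),
                nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
            )
            # Extra decoder, used only for the VAE reconstruction loss. Because it
            # is a submodule of this model, its parameters are in state_dict().
            self.decoder = nn.Sequential(
                nn.Linear(256, 32 * 9 * 9), nn.ReLU(),
                nn.Unflatten(1, (32, 9, 9)),
                nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2), nn.ReLU(),
                nn.ConvTranspose2d(16, 3, kernel_size=8, stride=4),
            )
            self.head = nn.Linear(256, num_outputs)
            self._features = None

        def forward(self, input_dict, state, seq_lens):
            obs = input_dict["obs"].float().permute(0, 3, 1, 2)  # NHWC -> NCHW
            self._features = self.encoder(obs)
            return self.head(self._features), state

        def value_function(self):
            # Dummy value head; the real model handles this differently.
            return torch.zeros(self._features.shape[0], device=self._features.device)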

When I resume training in this setup, however, the behavior suggests that not all of the model parameters are being properly restored.

Therefore, I suspect that when a new model (the VAE in this case) is added, there may be a point where its network parameters are not being saved or loaded properly.

Am I correct in my assumption? If so, what code modifications would need to be made?

Hi @Halman,

If you look at the RLlib code, we do save and load all the states.

It would be helpful if you could share a minimal repro (small and reproducible) to validate your hypothesis. Something like a unit test, and we can help you from there.
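
For example, roughly along these lines (just a sketch on a standard env and plain SAC; you would swap in your own config with the custom VAE model):

    import numpy as np
    from ray.rllib.algorithms.sac import SACConfig


    def test_weights_survive_checkpoint(tmp_path):  # tmp_path is a pytest fixture
        # Build a plain SAC algorithm (replace with your config / custom model).
        config = (
            SACConfig()
            .environment("Pendulum-v1")
            .framework("torch")
            .rollouts(num_rollout_workers=0)
        )
        algo = config.build()
        before = algo.get_policy().get_weights()

        # Save a checkpoint, then restore it into a fresh algorithm instance.
        checkpoint = algo.save(str(tmp_path))
        restored = config.build()
        restored.restore(checkpoint)
        after = restored.get_policy().get_weights()

        # All parameter names (e.g. your VAE/decoder layers) should be present,
        # and their values should round-trip unchanged.
        assert set(before) == set(after)
        for key in before:
            np.testing.assert_allclose(before[key], after[key])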


@kourosh

Thank you for answering! I got it.

If all model parameters are saved and loaded, then it is unclear what is causing the behavior I am seeing.

I cannot share my code in its entirety due to confidentiality issues.

However, I can share some of the results of the training to aid in the discussion.

As you can see, I restarted the second run just before the 3M-step mark, and there is a large discontinuity in actor_loss and td_error at that point.
The reward, on the other hand, drops slightly but immediately returns to the saturated value from the first run.

This seems like strange behavior; can it be reasonably interpreted?

Sincerely,

I forgot to post the picture.

Hi @Halman, I would recommend going to the code snippet kourosh mentioned above and printing the model’s state_dict in both get_weights and set_weights, as a starter. Check if the VAE layers are present in the state_dict printed.
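
For example, a quick (admittedly hacky) way to do this without editing the RLlib sources is to patch the two methods before building the algorithm, something like:

    # Debugging sketch only: wrap TorchPolicyV2.get_weights/set_weights so they
    # print the parameter names they handle. Run this before building the algo.
    from ray.rllib.policy.torch_policy_v2 import TorchPolicyV2

    _orig_get_weights = TorchPolicyV2.get_weights
    _orig_set_weights = TorchPolicyV2.set_weights


    def debug_get_weights(self):
        weights = _orig_get_weights(self)
        print("get_weights keys:", sorted(weights.keys()))
        return weights


    def debug_set_weights(self, weights):
        print("set_weights keys:", sorted(weights.keys()))
        return _orig_set_weights(self, weights)


    TorchPolicyV2.get_weights = debug_get_weights
    TorchPolicyV2.set_weights = debug_set_weights

If the VAE/decoder parameter names never show up in these prints, they are not part of the policy's state; if set_weights is never printed on resume, the weights are not being restored at all.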


@Rohan138 @kourosh

Thank you for your comment.
I found that the set_weights function in torch_policy_v2.py was not called when I resumed. Does this indicate that the resume is not working? Or is it possible that some other .py file handles the resume? (I am using Ray 2.3 right now.)

Right now I am resuming training with resume=True as follows; is there anything I am doing incorrectly?

    results = tune.run(
        "SAC",
        stop=stop,
        config=config,
        verbose=True,
        checkpoint_at_end=True,
        local_dir=ARGS.exp,
        resume=True,
        fail_fast="raise",
        checkpoint_freq=1000,
    )

Ah, resume=True will restart the trial, but it will not restore the weights. To do the latter, you need to pass in restore=path_to_your_checkpoint, I believe. Also, if you upgrade to Ray 2.5.1, you can instead use the ray.tune.Tuner class, which is better documented and maintained; tune.run will be deprecated shortly.
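
Roughly like this (a sketch only; the checkpoint path is a placeholder for one of the checkpoints written by checkpoint_freq / checkpoint_at_end):

    results = tune.run(
        "SAC",
        stop=stop,
        config=config,
        verbose=True,
        checkpoint_at_end=True,
        local_dir=ARGS.exp,
        restore="path/to/your/checkpoint",  # restores the saved algorithm state
        fail_fast="raise",
        checkpoint_freq=1000,
    )

    # Or, on Ray 2.5.1, with the Tuner API (path points at the experiment dir):
    # tuner = tune.Tuner.restore(path="path/to/experiment_dir", trainable="SAC")
    # results = tuner.fit()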