PPO - Load checkpoint from previous version fails

2dm · March 4, 2022, 9:03pm

Hi,

I am trying to load a ppo_trainer checkpoint that was created on v1.6 after upgrading to 1.10, but I get the following error:

AttributeError: Can't get attribute 'KLCoeffMixin' on <module 'ray.rllib.agents.ppo.ppo_torch_policy

I see that this class was deprecated with build_policy, but I can’t think of how to solve it.
Is there any workaround that would let me load this checkpoint successfully?

Thanks!

avnishn · March 16, 2022, 1:53am

Ah, we are unable to support checkpointing across different versions of RLlib at the moment, but we’re working on a solution for this and should be able to support it soon. In the meantime, you’ll need to restore your policy from the checkpoint using v1.6.

2dm · March 17, 2022, 6:44pm

Thanks for your answer @avnishn. Hope you’ll be able to publish it soon.

To anyone else who needs a solution - since I only wish to test my agent, for now I’m saving only the weights and loading them in the new version.

import pickle

# In the old version (v1.6): save only the policy weights
weights = agent.get_policy().get_weights()
with open('my_weights.pickle', 'wb') as handle:
    pickle.dump(weights, handle, protocol=pickle.HIGHEST_PROTOCOL)

# In the new version (1.10): load the weights into the new agent's policy
with open('my_weights.pickle', 'rb') as handle:
    weights = pickle.load(handle)
agent.get_policy().set_weights(weights)
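
For anyone who wants to reuse this, here is a minimal sketch of the same idea wrapped in two helpers. The function names `save_policy_weights` and `load_policy_weights` are hypothetical (not part of RLlib), and the key check assumes `get_weights()` returns a dict, which is what I see with the Torch policy; it may not hold for other frameworks. The check is only there to fail early if the model in the new version does not match the one that was saved.

```python
import pickle


def save_policy_weights(policy, path):
    """Pickle whatever policy.get_weights() returns to `path`."""
    with open(path, "wb") as handle:
        pickle.dump(policy.get_weights(), handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_policy_weights(policy, path):
    """Load pickled weights and apply them with policy.set_weights().

    Assumption: the weights are a dict keyed by parameter name. If the
    saved keys differ from the current policy's keys, raise instead of
    loading a mismatched model.
    """
    with open(path, "rb") as handle:
        weights = pickle.load(handle)

    expected = set(policy.get_weights().keys())
    found = set(weights.keys())
    if expected != found:
        raise ValueError(
            "Weight keys do not match: "
            f"missing={sorted(expected - found)}, "
            f"unexpected={sorted(found - expected)}"
        )

    policy.set_weights(weights)
    return weights
```

Usage would be `save_policy_weights(agent.get_policy(), 'my_weights.pickle')` in the old environment and `load_policy_weights(agent.get_policy(), 'my_weights.pickle')` in the new one, with `agent` built from the same config in both. Note that this restores only the network weights, not optimizer state or training counters, so it suits evaluation rather than resuming training.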
