I am trying to upgrade some code that has been using Ray 0.8.1 to Ray 1.3.0.
I am now trying to restore an APEX_DDPG trainer using a checkpoint stored with Ray 0.8.1, but I get the following error:
```
Traceback (most recent call last):
  File "rollout_and_plot.py", line 52, in <module>
    ro.run(test_config, config)
  File "/home/ubuntu/investiva/stocktrade/evaluate/rollout.py", line 61, in run
    agent.restore(model_path)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/tune/trainable.py", line 372, in restore
    self.load_checkpoint(checkpoint_path)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 755, in load_checkpoint
    self.__setstate__(extra_data)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 191, in __setstate__
    Trainer.__setstate__(self, state)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 1320, in __setstate__
    self.workers.local_worker().restore(state["worker"])
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1061, in restore
    self.policy_map[pid].set_state(state)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/policy/tf_policy.py", line 489, in set_state
    optimizer_vars = state.pop("_optimizer_variables", None)
TypeError: pop() takes at most 1 argument (2 given)
```
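For context on the error itself: `state.pop("_optimizer_variables", None)` is the dict form of `pop`, which accepts a default value. If the policy state deserialized from the old checkpoint is a list rather than a dict (as older checkpoint formats apparently produced), `list.pop` treats the two arguments as an index plus an extra argument and raises this exact TypeError. A minimal demonstration:

```python
# On a dict, pop with a default works as set_state expects:
state = {"weights": [1, 2, 3], "_optimizer_variables": [0.1]}
assert state.pop("_optimizer_variables", None) == [0.1]

# On a list (hypothetical old-format policy state), the same call
# raises the TypeError from the traceback, because list.pop only
# takes a single integer index:
old_state = [[1, 2, 3]]
try:
    old_state.pop("_optimizer_variables", None)
except TypeError as e:
    print(type(e).__name__, e)
```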
I have found a similar issue here for a different algorithm, which was solved with this PR. Also, when I train (and store checkpoints) with the current version using `tune.run("APEX_DDPG")`, restoring the trainer class does not produce this error.
Is this error caused by the restored policy state being a list/tuple rather than a dict (so that `state.pop("_optimizer_variables", None)` fails), or by a general incompatibility of checkpoints between Ray versions?
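One way to test the dict/list hypothesis directly is to inspect the pickled checkpoint. This is only a sketch: the layout assumed here (the checkpoint file is a pickled dict whose `"worker"` entry is itself a pickled blob containing a `"state"` mapping of policy id to policy state) is inferred from the `load_checkpoint`/`restore` calls in the traceback, not from documented API.

```python
import pickle

def policy_state_types(checkpoint_path):
    """Report the Python type of each policy's saved state.

    ASSUMPTION: Ray ~1.x checkpoint layout as described above; the
    "worker" entry is a nested pickle holding a "state" dict keyed
    by policy id.
    """
    with open(checkpoint_path, "rb") as f:
        extra_data = pickle.load(f)
    worker = pickle.loads(extra_data["worker"])
    return {pid: type(s).__name__ for pid, s in worker["state"].items()}
```

If this reports `list` (or `tuple`) for the 0.8.1 checkpoint but `dict` for a fresh 1.3.0 checkpoint, the old checkpoint format is the culprit rather than anything in the rollout code.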
Thanks in advance for your help.