Hi,
I am upgrading some code from Ray 0.8.1 to Ray 1.3.0.
When I try to restore an APEX_DDPG trainer from a checkpoint stored with Ray 0.8.1, I get the following error:
```
Traceback (most recent call last):
  File "rollout_and_plot.py", line 52, in <module>
    ro.run(test_config, config)
  File "/home/ubuntu/investiva/stocktrade/evaluate/rollout.py", line 61, in run
    agent.restore(model_path)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/tune/trainable.py", line 372, in restore
    self.load_checkpoint(checkpoint_path)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 755, in load_checkpoint
    self.__setstate__(extra_data)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 191, in __setstate__
    Trainer.__setstate__(self, state)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 1320, in __setstate__
    self.workers.local_worker().restore(state["worker"])
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1061, in restore
    self.policy_map[pid].set_state(state)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/policy/tf_policy.py", line 489, in set_state
    optimizer_vars = state.pop("_optimizer_variables", None)
TypeError: pop() takes at most 1 argument (2 given)
```
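For context, the TypeError itself comes from the built-in `pop` signatures: `dict.pop(key, default)` accepts a fallback value, but `list.pop(index)` takes at most one argument. A minimal sketch of the mismatch (the state values here are made up, not real RLlib state):

```python
# dict.pop(key, default) accepts a fallback value, so this call is fine.
new_style_state = {"weights": [0.1, 0.2], "_optimizer_variables": [0.0]}
optimizer_vars = new_style_state.pop("_optimizer_variables", None)

# list.pop(index) takes at most one argument, so the same two-argument
# call fails if the restored policy state is a list rather than a dict.
old_style_state = [0.1, 0.2]
try:
    old_style_state.pop("_optimizer_variables", None)
except TypeError as e:
    print(type(e).__name__)  # TypeError
```

So the line in `tf_policy.py` only works when `state` is a dict, which suggests the 0.8.1 checkpoint deserialized the policy state as a list.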
I have found a similar issue here for a different algorithm, which was solved with this PR. Also, when I train (and store checkpoints) with the current version using `ray.tune.run("APEX_DDPG")`, I do not get this error when restoring the trainer.
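One way to narrow this down is to inspect what type the old checkpoint actually stored per policy. The sketch below is hedged: it follows the nested layout the traceback implies (trainer state holding a pickled worker state whose `"state"` entry maps policy IDs to per-policy state), writes a fake checkpoint with that layout so the example is runnable, and uses a hypothetical path; the real 0.8.1 file layout may differ.

```python
import pickle

# Hypothetical checkpoint path, used here for a self-contained demo.
checkpoint_path = "checkpoint-100"

# Write a fake checkpoint mimicking the nested layout from the traceback:
# Trainer.__setstate__ unpickles a "worker" blob, whose "state" maps
# policy IDs to per-policy state.
fake_worker_blob = pickle.dumps({"state": {"default_policy": [0.1, 0.2]}})
with open(checkpoint_path, "wb") as f:
    pickle.dump({"worker": fake_worker_blob}, f)

# Inspect the per-policy state type: a list here would explain the
# pop() TypeError, since set_state in 1.3.0 expects a dict.
with open(checkpoint_path, "rb") as f:
    trainer_state = pickle.load(f)
worker_state = pickle.loads(trainer_state["worker"])
for pid, policy_state in worker_state["state"].items():
    print(pid, type(policy_state).__name__)
```

Running the same inspection against the real 0.8.1 checkpoint file would show whether the stored policy state is a list or a dict.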
Is this error caused by the old checkpoint storing the policy state as a list where the new code expects a dict, or by a more general incompatibility between checkpoints across Ray versions?
Thanks in advance for your help.