I am trying to upgrade some code that has been using Ray 0.8.1 to Ray 1.3.0.
I am now trying to restore an APEX_DDPG trainer using a checkpoint stored with Ray 0.8.1, but I get the following error:
```
Traceback (most recent call last):
  File "rollout_and_plot.py", line 52, in <module>
    ro.run(test_config, config)
  File "/home/ubuntu/investiva/stocktrade/evaluate/rollout.py", line 61, in run
    agent.restore(model_path)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/tune/trainable.py", line 372, in restore
    self.load_checkpoint(checkpoint_path)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 755, in load_checkpoint
    self.__setstate__(extra_data)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 191, in __setstate__
    Trainer.__setstate__(self, state)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 1320, in __setstate__
    self.workers.local_worker().restore(state["worker"])
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1061, in restore
    self.policy_map[pid].set_state(state)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/policy/tf_policy.py", line 489, in set_state
    optimizer_vars = state.pop("_optimizer_variables", None)
TypeError: pop() takes at most 1 argument (2 given)
```
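For context on the error itself: `state.pop("_optimizer_variables", None)` is the dict form of `pop`, which accepts a default value. If the policy state deserialized from the old checkpoint is a list rather than a dict (as older checkpoint formats apparently produced), `list.pop` treats the two arguments as an index plus an extra argument and raises this exact TypeError. A minimal demonstration:

```python
# On a dict, pop with a default works as set_state expects:
state = {"weights": [1, 2, 3], "_optimizer_variables": [0.1]}
assert state.pop("_optimizer_variables", None) == [0.1]

# On a list (hypothetical old-format policy state), the same call
# raises the TypeError from the traceback, because list.pop only
# takes a single integer index:
old_state = [[1, 2, 3]]
try:
    old_state.pop("_optimizer_variables", None)
except TypeError as e:
    print(type(e).__name__, e)
```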
I have found a similar issue here for a different algorithm, which was solved with this PR. Also, when I train (and store checkpoints) with the current version using `tune.run("APEX_DDPG")`, restoring the trainer class does not produce this error.
Is this error caused by the restored policy state being a list/tuple rather than a dict (so that `state.pop("_optimizer_variables", None)` fails), or by a general incompatibility of checkpoints between Ray versions?
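One way to test the dict/list hypothesis directly is to inspect the pickled checkpoint. This is only a sketch: the layout assumed here (the checkpoint file is a pickled dict whose `"worker"` entry is itself a pickled blob containing a `"state"` mapping of policy id to policy state) is inferred from the `load_checkpoint`/`restore` calls in the traceback, not from documented API.

```python
import pickle

def policy_state_types(checkpoint_path):
    """Report the Python type of each policy's saved state.

    ASSUMPTION: Ray ~1.x checkpoint layout as described above; the
    "worker" entry is a nested pickle holding a "state" dict keyed
    by policy id.
    """
    with open(checkpoint_path, "rb") as f:
        extra_data = pickle.load(f)
    worker = pickle.loads(extra_data["worker"])
    return {pid: type(s).__name__ for pid, s in worker["state"].items()}
```

If this reports `list` (or `tuple`) for the 0.8.1 checkpoint but `dict` for a fresh 1.3.0 checkpoint, the old checkpoint format is the culprit rather than anything in the rollout code.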
Thanks in advance for your help.