Restoring an APEX_DDPG trainer from a checkpoint saved with an older Ray version

Hi,

I am trying to upgrade some code from Ray 0.8.1 to Ray 1.3.0.

I am now trying to restore an APEX_DDPG trainer using a checkpoint stored with Ray 0.8.1, but I get the following error:

```
Traceback (most recent call last):
  File "rollout_and_plot.py", line 52, in <module>
    ro.run(test_config, config)
  File "/home/ubuntu/investiva/stocktrade/evaluate/rollout.py", line 61, in run
    agent.restore(model_path)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/tune/trainable.py", line 372, in restore
    self.load_checkpoint(checkpoint_path)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 755, in load_checkpoint
    self.__setstate__(extra_data)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 191, in __setstate__
    Trainer.__setstate__(self, state)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 1320, in __setstate__
    self.workers.local_worker().restore(state["worker"])
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1061, in restore
    self.policy_map[pid].set_state(state)
  File "/home/ubuntu/miniconda3/envs/pcv/lib/python3.6/site-packages/ray/rllib/policy/tf_policy.py", line 489, in set_state
    optimizer_vars = state.pop("_optimizer_variables", None)
TypeError: pop() takes at most 1 argument (2 given)
```
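For context, this is roughly how I build and restore the trainer (the env, config values, and paths below are placeholders, not my exact setup):

```python
import ray
from ray.rllib.agents.ddpg import ApexDDPGTrainer

ray.init()

# Placeholder config -- in reality I rebuild the same config
# that was used for training under Ray 0.8.1.
config = {
    "env": "Pendulum-v0",
    "num_workers": 2,
}

agent = ApexDDPGTrainer(config=config)

# model_path points at the checkpoint file written by Ray 0.8.1
# (hypothetical path for illustration).
model_path = "/path/to/old/checkpoint_100/checkpoint-100"
agent.restore(model_path)  # <-- this is the call that raises the TypeError
```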

I have found a similar issue for a different algorithm, which was solved with this PR. Also, when I train (and save checkpoints) with tune.run("APEX_DDPG") on the current version, I do not get this error when restoring the trainer class.
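For comparison, checkpoints produced roughly like this under Ray 1.3.0 restore without problems (again a sketch with a placeholder config, not my exact setup):

```python
from ray import tune

tune.run(
    "APEX_DDPG",
    config={
        "env": "Pendulum-v0",  # placeholder env
        "num_workers": 2,
    },
    stop={"training_iteration": 100},
    checkpoint_freq=10,        # save a checkpoint every 10 iterations
    checkpoint_at_end=True,
)
```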

Is this error caused by the checkpointed state being a list where the new code expects a dict (so pop() is called with two arguments on a list), or by a more general incompatibility when restoring checkpoints across Ray versions?
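What makes me suspect the type is that dict.pop accepts a default value while list.pop does not, and the latter reproduces exactly this error message on Python 3.6:

```python
state = {"_optimizer_variables": [1, 2, 3]}
state.pop("_optimizer_variables", None)  # fine: dict.pop(key, default)

state = ["something", "else"]
state.pop("_optimizer_variables", None)
# TypeError: pop() takes at most 1 argument (2 given)
```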

Thanks in advance for your help.