MADDPG weights in a list instead of dict

I get this error with MADDPG:

  File "*/lib/python3.7/site-packages/ray/tune/function_runner.py", line 248, in run
    self._entrypoint()
  File "*/lib/python3.7/site-packages/ray/tune/function_runner.py", line 316, in entrypoint
    self._status_reporter.get_checkpoint())
  File "*/lib/python3.7/site-packages/ray/tune/function_runner.py", line 580, in _trainable_func
    output = fn()
  File "*/lib/python3.7/site-packages/rlfw/train/run_utils.py", line 126, in run_xp
    latest_result = trainer.train()
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 576, in train
    raise e
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 562, in train
    result = Trainable.train(self)
  File "*/lib/python3.7/site-packages/ray/tune/trainable.py", line 232, in train
    result = self.step()
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 171, in step
    evaluation_metrics = self._evaluate()
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 807, in _evaluate
    self._sync_weights_to_workers(worker_set=self.evaluation_workers)
  File "*lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 860, in _sync_weights_to_workers
    worker_set.foreach_worker(lambda w: w.restore(ray.get(weights)))
  File "*/lib/python3.7/site-packages/ray/rllib/evaluation/worker_set.py", line 160, in foreach_worker
    local_result = [func(self.local_worker())]
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 860, in <lambda>
    worker_set.foreach_worker(lambda w: w.restore(ray.get(weights)))
  File "*/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1061, in restore
    self.policy_map[pid].set_state(state)
  File "*lib/python3.7/site-packages/ray/rllib/contrib/maddpg/maddpg_policy.py", line 313, in set_state
    TFPolicy.set_state(self, state)
  File "*/lib/python3.7/site-packages/ray/rllib/policy/tf_policy.py", line 489, in set_state
    optimizer_vars = state.pop("_optimizer_variables", None)
TypeError: pop() takes at most 1 argument (2 given)

Have you ever encountered this? Here, state is a list of all my weights instead of a dict.
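To illustrate why the last frame of the traceback fails, here is a minimal sketch in plain Python. The two state objects are hypothetical stand-ins, not RLlib objects; the only point is that dict.pop accepts a key plus a default, while list.pop accepts at most one argument (an index).

```python
# Hypothetical stand-ins for the policy state; these are not RLlib objects.
state_as_dict = {"_optimizer_variables": [0.1, 0.2], "weights": [1.0]}
state_as_list = [[1.0], [2.0]]

# dict.pop takes a key and an optional default, so this call succeeds.
optimizer_vars = state_as_dict.pop("_optimizer_variables", None)

# list.pop takes at most one argument (an index), so the same call raises
# TypeError, as in the traceback above.
error_message = None
try:
    state_as_list.pop("_optimizer_variables", None)
except TypeError as exc:
    error_message = str(exc)

print(optimizer_vars)
print(error_message)
```

The exact wording of the TypeError message depends on the Python version, but the cause is the same: the code in tf_policy.py expects a dict and receives a list.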

UPDATE: I put this in the get_weights method of maddpg_policy.py so I could get a dict:
@override(TFPolicy)
def get_weights(self):
    var_list = []
    for var in self.vars.values():
        var_list += var
    weights_dict = {}
    weights = self.sess.run(var_list)
    for i in range(len(weights)):
        weights_dict[var_list[i]] = weights[0][i]
    # return self.sess.run(var_list)
    return weights_dict
I’m now getting a "numpy() is only available when eager execution is enabled" error.
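As a side note on the workaround above: weights[0][i] reads element i of the first fetched array rather than the i-th array, and the dict is keyed by variable objects rather than by names. A minimal sketch of pairing names with values is shown below, written without TensorFlow so it runs anywhere. The function weights_by_name and the names and values are hypothetical; in the real policy they would presumably come from each variable's name and from the result of sess.run. This is not claimed to resolve the eager-execution error.

```python
def weights_by_name(names, values):
    """Pair each variable name with its fetched value, preserving order."""
    if len(names) != len(values):
        raise ValueError("names and values must have the same length")
    return dict(zip(names, values))


# Hypothetical variable names and fetched values; plain lists stand in for arrays.
names = ["actor/kernel:0", "actor/bias:0"]
values = [[[0.1, 0.2], [0.3, 0.4]], [0.0, 0.0]]

weights_dict = weights_by_name(names, values)
print(sorted(weights_dict))
```

Keying by string name keeps the dict serializable and independent of the TensorFlow graph that produced the values.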

Edit 2: I think it might be because I use TF 2.x instead of TF 1.x.

Hi @Clement_Collgon,

This should be fixed in the nightly build with this commit. If you try it, will you let me know whether it fixed your issue?

Hi @mannyv ,
Thank you very much for your message, it did fix my issue!
