MADDPG weights in a list instead of dict

Clement_Collgon · May 7, 2021, 8:35am

I get this error with MADDPG:

  File "*/lib/python3.7/site-packages/ray/tune/function_runner.py", line 248, in run
    self._entrypoint()
  File "*/lib/python3.7/site-packages/ray/tune/function_runner.py", line 316, in entrypoint
    self._status_reporter.get_checkpoint())
  File "*/lib/python3.7/site-packages/ray/tune/function_runner.py", line 580, in _trainable_func
    output = fn()
  File "*/lib/python3.7/site-packages/rlfw/train/run_utils.py", line 126, in run_xp
    latest_result = trainer.train()
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 576, in train
    raise e
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 562, in train
    result = Trainable.train(self)
  File "*/lib/python3.7/site-packages/ray/tune/trainable.py", line 232, in train
    result = self.step()
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 171, in step
    evaluation_metrics = self._evaluate()
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 807, in _evaluate
    self._sync_weights_to_workers(worker_set=self.evaluation_workers)
  File "*lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 860, in _sync_weights_to_workers
    worker_set.foreach_worker(lambda w: w.restore(ray.get(weights)))
  File "*/lib/python3.7/site-packages/ray/rllib/evaluation/worker_set.py", line 160, in foreach_worker
    local_result = [func(self.local_worker())]
  File "*/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 860, in <lambda>
    worker_set.foreach_worker(lambda w: w.restore(ray.get(weights)))
  File "*/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1061, in restore
    self.policy_map[pid].set_state(state)
  File "*lib/python3.7/site-packages/ray/rllib/contrib/maddpg/maddpg_policy.py", line 313, in set_state
    TFPolicy.set_state(self, state)
  File "*/lib/python3.7/site-packages/ray/rllib/policy/tf_policy.py", line 489, in set_state
    optimizer_vars = state.pop("_optimizer_variables", None)
TypeError: pop() takes at most 1 argument (2 given)

Have you ever encountered this? Here, `state` is a list of all my weights instead of a dict.
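For context, the TypeError can be reproduced without RLlib: `list.pop` accepts only an optional index, whereas `dict.pop` accepts a key plus a default. A minimal sketch, where the state contents are made-up placeholders and not the real policy state:

```python
# Hypothetical stand-ins for the policy state; the values are made up.
state_as_dict = {"weights": [0.1, 0.2], "_optimizer_variables": [0.0]}
state_as_list = [0.1, 0.2]

# dict.pop(key, default) is the call made on the last frame of the traceback.
optimizer_vars = state_as_dict.pop("_optimizer_variables", None)

# list.pop takes at most one argument (an index), so the same call fails.
try:
    state_as_list.pop("_optimizer_variables", None)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(optimizer_vars)
print(error_message)
```

The exact wording of the TypeError message depends on the Python version, so it may not match the traceback character for character.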

Clement_Collgon · May 7, 2021, 8:53am

UPDATE: I put this in the get_weights method of maddpg_policy.py so that I get a dict:

    @override(TFPolicy)
    def get_weights(self):
        var_list = []
        for var in self.vars.values():
            var_list += var
        weights_dict = {}
        weights = self.sess.run(var_list)
        for i in range(len(weights)):
            weights_dict[var_list[i]] = weights[0][i]
        # return self.sess.run(var_list)
        return weights_dict

I’m now getting a "numpy() is only available when eager execution is enabled" error.

Edit 2: I think it might be because I use TF 2.x instead of TF 1.x.

mannyv · May 7, 2021, 10:36am

Hi @Clement_Collgon,

This should be fixed in the nightly build with this commit. If you try it, could you let me know whether it fixes your issue?

github.com/ray-project/ray

[RLlib][contrib][maddpg] update get/set_weights to use dictionaries

ray-project:master ← mvindiola1:14097_maddpg_restore_checkpoint

opened 05:37PM - 24 Mar 21 UTC

mvindiola1

+3 -2

## Why are these changes needed?

maddpg get_weights is currently returning a list of weights rather than a dictionary. This causes tf_policy.py set_state to fail when trying to get "optimizer_vars". This commit updates maddpg to use a dictionary rather than a list in get/set_weights.

## Related issue number

Closes #14097

## Checks

- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - [ ] Unit tests
  - [ ] Release tests
  - [ ] This PR is not tested :(
  - [x] Tested with script in the github issue.
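The idea in the PR description (weights exchanged as a dictionary rather than a list) can be sketched roughly as follows. This is only an illustration, not the code in the PR: `ToyPolicy`, its variable names, and the plain floats standing in for tensors are all hypothetical, and no RLlib or TensorFlow API is used.

```python
class ToyPolicy:
    """Hypothetical stand-in for a policy; not the RLlib MADDPG class."""

    def __init__(self):
        # Variable name -> value. Plain floats stand in for tensors.
        self.variables = {"actor/kernel": 0.0, "critic/kernel": 0.0}

    def get_weights(self):
        # Return a dict keyed by variable name rather than a bare list.
        return dict(self.variables)

    def set_weights(self, weights):
        for name, value in weights.items():
            self.variables[name] = value

    def get_state(self):
        state = self.get_weights()
        state["_optimizer_variables"] = []  # placeholder optimizer state
        return state

    def set_state(self, state):
        state = dict(state)  # copy so the caller's dict is not mutated
        # This pop works only because the state is a dict, not a list.
        state.pop("_optimizer_variables", None)
        self.set_weights(state)


source = ToyPolicy()
source.variables["actor/kernel"] = 1.5
target = ToyPolicy()
target.set_state(source.get_state())
print(target.get_weights())
```

Keying by name also means the receiving side does not depend on the order in which the weights were collected.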

Clement_Collgon · May 7, 2021, 2:53pm

Hi @mannyv ,
Thank you very much for your message, it did fix my issue!
