Policy weights overwritten in self-play

george_sk · July 7, 2021, 10:15am

Hi @mannyv and @sven1977 ,

I would like your help clarifying how weight syncing works in this example.

As I understand it, when I set weights for a policy I have to do the syncing manually. In this topic, https://discuss.ray.io/t/board-game-self-play-ppo/1425/13, Sven proposed:

trainer.set_weights({"shared_policy_2": self.men[sel]})  # loading weights

weights = ray.put(trainer.workers.local_worker().save())
trainer.workers.foreach_worker(
    lambda w: w.restore(ray.get(weights))
)

or

trainer.set_weights({"shared_policy_2": self.men[sel]})  # loading weights

local_weights = trainer.workers.local_worker().get_weights()
trainer.workers.foreach_worker(lambda worker: worker.set_weights(local_weights))

Manny said here that he got an error with the first method and proposed:

trainer.set_weights({"shared_policy_2": self.men[sel]})

weights = ray.put(trainer.get_weights("shared_policy_2"))
trainer.workers.foreach_worker(
    lambda w: w.set_weights(ray.get(weights))
)

I want to ask whether all these methods are equivalent, meaning that they all sync the weights to every worker.
I am especially unsure about Manny's method, since it does:

weights = ray.put(trainer.get_weights("shared_policy_2"))

Does this sync the weights only for "shared_policy_2", or does it overwrite the weights of "shared_policy_1" with those of "shared_policy_2", since it is applied on all the workers? It looks like the latter to me, but I am not sure how it works.
Also, all three methods run on my PC without errors.
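
To make the question concrete, here is a toy sketch of the behaviour I am asking about. It does not use RLlib at all: `ToyWorker` is a hypothetical stand-in, and it assumes that weights are passed around as a dict keyed by policy ID and that `set_weights` only touches the policy IDs present in that dict. Whether RLlib's workers really behave this way is exactly what I would like confirmed.

```python
import copy


class ToyWorker:
    """Hypothetical stand-in for a rollout worker; NOT an RLlib class."""

    def __init__(self, weights):
        # One entry per policy ID, e.g. {"shared_policy_1": [...], ...}
        self.policy_weights = copy.deepcopy(weights)

    def get_weights(self, policies=None):
        # Return only the requested policy IDs (all of them if None).
        return {
            pid: copy.deepcopy(w)
            for pid, w in self.policy_weights.items()
            if policies is None or pid in policies
        }

    def set_weights(self, weights):
        # Assumption: only the policy IDs present in `weights` are updated.
        for pid, w in weights.items():
            self.policy_weights[pid] = copy.deepcopy(w)


initial = {"shared_policy_1": [0.1, 0.2], "shared_policy_2": [0.3, 0.4]}
local = ToyWorker(initial)
remotes = [ToyWorker(initial) for _ in range(2)]

# Load stored weights into one policy on the local worker ...
local.set_weights({"shared_policy_2": [9.0, 9.0]})

# ... then push only that policy's weights to every remote worker.
partial = local.get_weights(["shared_policy_2"])
for worker in remotes:
    worker.set_weights(partial)
```

Under that assumption, every remote worker ends up with the new "shared_policy_2" weights while "shared_policy_1" is left untouched, so the partial sync would be safe.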

Thanks,
George
