How severe does this issue affect your experience of using Ray?
- Low: It annoys or frustrates me for a moment.
Greetings!
I’ve been trying to implement a MuZero agent. My implementation is closely based on muzero-general.
Unfortunately, when I use Ray to run self-play games in parallel in test mode, the running time of my script is roughly the same no matter how many self-play Actors I use. Since I’m totally new to Ray, I’m not sure whether I’m using it correctly.
Please help me out! Thank you in advance and have a nice day!
Here is how I’m doing it:
checkpoint = torch.load(os.path.join(self.config.log_dir, 'model.checkpoint'))
self_play_workers = [
    SelfPlay.remote(deepcopy(self.game), checkpoint, self.config, self.config.seed + 10 * i)
    for i in range(self.config.workers)
]
histories = []
for _ in tqdm(range(math.ceil(self.config.tests / self.config.workers)), desc='Testing'):
    hs = [
        ray.get(worker.play.remote(
            0,  # select actions with max #visits
            self.config.opponent,
            self.config.muzero_player,
            self.config.render)
        ) for worker in self_play_workers
    ]
    for h in hs:
        histories.append(h)
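
One thing that may explain the flat running time: in the snippet above, `ray.get(...)` is called inside the list comprehension, so each `play.remote(...)` call is awaited before the next one is submitted and the workers effectively run one after another. `ray.get` also accepts a list of object refs, so the usual pattern is to collect all refs first and call `ray.get` once on the list. The sketch below is only an analogy using the standard library's `ThreadPoolExecutor` in place of Ray (so it runs without a cluster); `play`, `run_blocking`, `run_batched`, and `timed` are hypothetical names, and the 0.2 s sleep is a stand-in for one self-play game.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def play(worker_id):
    """Hypothetical stand-in for one self-play game (not the real SelfPlay.play)."""
    time.sleep(0.2)
    return worker_id


def run_blocking(pool, n_workers):
    # Analogous to [ray.get(w.play.remote(...)) for w in workers]:
    # each result is awaited before the next task is submitted.
    return [pool.submit(play, i).result() for i in range(n_workers)]


def run_batched(pool, n_workers):
    # Analogous to refs = [w.play.remote(...) for w in workers]; ray.get(refs):
    # all tasks are submitted first, then the results are collected.
    futures = [pool.submit(play, i) for i in range(n_workers)]
    return [f.result() for f in futures]


def timed(fn, *args):
    # Return the function's result together with its wall-clock duration.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


with ThreadPoolExecutor(max_workers=4) as pool:
    serial_result, serial_time = timed(run_blocking, pool, 4)
    batched_result, batched_time = timed(run_batched, pool, 4)

print(f"blocking: {serial_time:.2f}s, batched: {batched_time:.2f}s")
```

Both versions return the same results in the same order; only the point at which the caller blocks differs. If this is the cause, the blocking version's time should grow with the number of workers while the batched version's stays close to the time of a single game.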