Hello,
I've recently started learning RLlib and Tune, and tried to run the following example script from a book I've been following:
import ray
from ray.rllib.agents.ppo.ppo import PPOTrainer
from ray.tune.logger import pretty_print
import random

# ... the TicTacToe multi-agent env class is copy-pasted here from the book's repo
#     (omitted for brevity, see the link below) ...

if __name__ == "__main__":
    ray.shutdown()
    ray.init(num_cpus=3, ignore_reinit_error=True)
    env = TicTacToe()
    num_policies = 4
    policies = {
        "policy_{}".format(i): (None, env.observation_space, env.action_space, {})
        for i in range(num_policies)
    }
    policy_ids = list(policies.keys())
    config = {
        "multiagent": {
            "policies": policies,
            "policy_mapping_fn": (lambda agent_id: random.choice(policy_ids)),
        },
        "framework": "tf",
        "num_workers": 4,  # Adjust this according to the number of CPUs on your machine.
    }
    trainer = PPOTrainer(env=TicTacToe, config=config)
    best_eps_len = 0
    mean_reward_thold = -1
    while True:
        results = trainer.train()
        print(pretty_print(results))
        if results["episode_reward_mean"] > mean_reward_thold and results["episode_len_mean"] > best_eps_len:
            trainer.save("ttt_model")
            best_eps_len = results["episode_len_mean"]
            print("--------------------- MODEL SAVED!")
        if results.get("timesteps_total") > 10 ** 7:
            break
    ray.shutdown()
The code for TicTacToe is copy-pasted from here (Mastering-Reinforcement-Learning-with-Python/tic_tac_toe.py at master · PacktPublishing/Mastering-Reinforcement-Learning-with-Python · GitHub); I won't post it here for brevity.
The thing is, every time I try to run it, be it in Colab or Kaggle (with 2 and 4 CPUs available, respectively), I get the following:
(scheduler +1h27m38s) Warning: The following resource request cannot be scheduled right now: {'CPU': 1.0}. This is likely due to all cluster resources being claimed by actors. Consider creating fewer actors or adding more nodes to this Ray cluster.
I tried changing the number of workers, but it did not help at all. I also tried running it locally, where I have 4 CPU cores, but to no avail. What can I do about that? Is there no way to run multi-agent settings with access to only 2 or 4 cores?
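For reference, the kind of change I tried looks roughly like the sketch below (the exact num_workers and num_cpus values varied from run to run; 1 worker / 2 CPUs is just one example of what I attempted):

import ray

# One variation I tried: request only as many CPUs as the machine actually has
# and reduce the number of rollout workers (values here are just examples).
ray.shutdown()
ray.init(num_cpus=2, ignore_reinit_error=True)

config = {
    "multiagent": {
        "policies": policies,
        "policy_mapping_fn": (lambda agent_id: random.choice(policy_ids)),
    },
    "framework": "tf",
    "num_workers": 1,  # reduced from 4, but the scheduler warning still appears
}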