I am running on an M1 Mac Pro (32 GB RAM, 10 CPU cores), if it matters.
Ray RLlib code
When I run experiments, they sometimes get stuck: the training loop never reaches the print statement that says an iteration is finished. I have tried new Python environments, restarting, different Gym environments, different algorithm configs, etc., but the issue persists. I am using very basic code, so I don't understand what could be wrong. Here are two versions that have both shown this behavior.
# import statements for both codes
from ray.rllib.algorithms.ppo import PPOConfig
from ray.rllib.algorithms.dqn import DQNConfig
from ray.rllib.algorithms.a3c import A3CConfig
import time
import ray
import tensorflow as tf
import numpy as np
import ray.tune as tune
from ray import air
env_name = "MountainCar-v0"
ray.init()
num_rollout_workers = 7
config = (
    DQNConfig()
    .environment(env_name)
    .rollouts(num_envs_per_worker=1, num_rollout_workers=num_rollout_workers)
    .framework("tf2")
    .training(
        model={"fcnet_hiddens": [32, 32], "fcnet_activation": "relu"},
        train_batch_size=512,
        lr=0.0001,
        gamma=0.95,
    )
    .evaluation(evaluation_num_workers=1, evaluation_duration=300, evaluation_duration_unit="timesteps")
)
run_tuner = tune.Tuner(
    "DQN",
    run_config=air.RunConfig(stop={"episode_reward_mean": -120, "training_iteration": 3}),
    param_space=config,
)
results = run_tuner.fit()
and
ray.init()
num_rollout_workers = 7
num_iters = 50
config = (  # 1. Configure the algorithm,
    DQNConfig()
    .environment(env_name)
    .rollouts(num_envs_per_worker=1, num_rollout_workers=num_rollout_workers)
    .framework("tf2")
    .training(
        model={"fcnet_hiddens": [32, 32], "fcnet_activation": "relu"},
        train_batch_size=512,
        lr=tune.grid_search([0.001, 0.0001, 0.00001]),
        gamma=0.95,
    )
    .evaluation(evaluation_num_workers=1, evaluation_duration=300, evaluation_duration_unit="timesteps")
)
algo = config.build()  # 2. build the algorithm,
for i in range(num_iters):
    algo.train()  # 3. train it,
    print(f"Step {i} done")
    if i % 50 == 0:
        print(algo.evaluate())
Is there a bug in my code? Are there logs I can check? Or is there a setting that lets me detect when a step is taking too long and, if so, skip that step, reset the environments, and then continue training?
Here are a few main details from my Python environment:
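To make concrete what I mean by detecting a slow step, here is a minimal sketch using only the standard library. `run_with_timeout` is a hypothetical helper of my own, not an RLlib API; in my loop, `step_fn` would be `algo.train`. It only reports that the timeout was exceeded. It cannot cancel the stuck call, and it does not reset any environments.

```python
import concurrent.futures
import time


def run_with_timeout(step_fn, timeout_s):
    """Run step_fn in a worker thread and wait at most timeout_s seconds.

    Returns (True, result) if the step finished in time, else (False, None).
    The worker thread is NOT killed on timeout; it keeps running in the
    background, so this only detects a hang, it does not recover from one.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(step_fn)
    try:
        return True, future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return False, None
    finally:
        # Do not block here waiting for a possibly stuck step.
        executor.shutdown(wait=False)


if __name__ == "__main__":
    # Stand-ins for algo.train(): one fast step, one slow step.
    print(run_with_timeout(lambda: "done", timeout_s=1.0))
    print(run_with_timeout(lambda: time.sleep(0.3), timeout_s=0.05))
```

What I am really after is the part this sketch cannot do: abandoning the stuck step and continuing training.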
ray==2.4.0
python==3.8.10 #via conda-forge
tensorflow-macos==2.12.0
tensorflow-metal==0.8.0
grpcio==1.49.1 # via conda-forge as install instructions recommend. Also used 1.49.1 since tf requires a later version, but have tried downgrading tf and grpcio as well.
gymnasium==0.26.3
When I check TensorBoard, it also shows that the system is stuck somewhere, as no new results are being added. I have left the system running overnight without a single new iteration.
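One thing I could try in order to see where the driver process is stuck is Python's built-in `faulthandler`, which can dump the stack of every thread if a call runs past a deadline. This is only a sketch: `watch_step` is a hypothetical helper name, and it would only show the driver's own threads, not the Ray rollout workers, which run in separate processes.

```python
import faulthandler
import sys
import time


def watch_step(step_fn, timeout_s, log_file=sys.stderr):
    """Run step_fn; if it takes longer than timeout_s seconds, dump the
    tracebacks of all threads in this process to log_file.

    log_file must be a real file object (it needs a file descriptor).
    The step itself is not interrupted.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=log_file)
    try:
        return step_fn()
    finally:
        # Disarm the watchdog once the step returns (or raises).
        faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    # Stand-in for algo.train(): sleeps longer than the timeout,
    # so a traceback is written to stderr.
    watch_step(lambda: time.sleep(0.3), timeout_s=0.05)
```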