DreamerV3 training on CartPole-v1 gets stuck after an hour. This is the script:
from ray.rllib.algorithms.dreamerv3 import DreamerV3Config

config = (
    DreamerV3Config()
    .environment("CartPole-v1")
    .training(
        model_size="XXS",
        training_ratio=1,
        # TODO
        model={
            "batch_size_B": 1,
            "batch_length_T": 1,
            "horizon_H": 100000,
            "gamma": 0.997,
            "model_size": "XXS",
        },
    )
)
config = config.resources(num_learner_workers=0, num_cpus_per_worker=8)
algo = config.build()
while True:
    results = algo.train()
    print("Start training")
    print(f"num_env_steps_sampled: {results['num_env_steps_sampled']}")
    print(f"Agent time steps total: {results['agent_timesteps_total']}")
    print(f"Episodes total: {results['episodes_total']}")
    if results['num_env_steps_sampled'] >= 100000:
        break
This is the output in the terminal towards the end:
Start training
num_env_steps_sampled: 46081
Agent time steps total: 46081
Episodes total: 2407
Start training
num_env_steps_sampled: 47105
Agent time steps total: 47105
Episodes total: 2476
Start training
num_env_steps_sampled: 48129
Agent time steps total: 48129
Episodes total: 2544
The run is stuck at 2544 episodes and no further output appears. This is the output in TensorBoard:
Running on Ubuntu 22.04 with Ray 2.9.2 and Gymnasium 0.28.1.
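
To narrow down whether `algo.train()` itself blocks or the loop just stops making progress, a small watchdog around the call might help. This is only a sketch: the helper name `call_with_timeout` and the time limits are made up for illustration and are not part of RLlib, and `fake_train` is a stand-in so the snippet runs without Ray.

```python
import threading
import time


def call_with_timeout(fn, timeout_s):
    """Run fn() in a daemon thread and wait at most timeout_s seconds.

    Returns (True, result) if fn finished in time, (False, None) otherwise.
    An exception raised by fn is re-raised in the calling thread.
    """
    box = {}

    def runner():
        try:
            box["result"] = fn()
        except BaseException as exc:  # hand the error back to the caller
            box["error"] = exc

    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        # fn is still running; the thread is abandoned, not killed.
        return False, None
    if "error" in box:
        raise box["error"]
    return True, box["result"]


if __name__ == "__main__":
    # Hypothetical stand-in for algo.train(): sleeps longer than the limit.
    def fake_train():
        time.sleep(0.5)
        return {"num_env_steps_sampled": 1024}

    finished, results = call_with_timeout(fake_train, timeout_s=0.1)
    if not finished:
        print("train() did not return within the limit")
```

In the loop above, `results = algo.train()` would become `finished, results = call_with_timeout(algo.train, 600)`, with a `break` when `finished` is false. The standard library's `faulthandler.dump_traceback_later` can additionally print the Python stack of the blocked process after a delay.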