DreamerV3 hangs when using a loop for multiple training sessions

Hi,
I’ve been testing the DreamerV3 algorithm with a custom Gymnasium environment inside a Jupyter Notebook. Training a single instance runs smoothly, but as soon as I run multiple training sessions in a loop, the algorithm hangs. For example:

from ray.rllib.algorithms.dreamerv3 import DreamerV3, DreamerV3Config

num_sessions = 10

for session in range(num_sessions):
    print(f"Starting training session {session+1}")

    config = (
        DreamerV3Config()
        .environment("test_env_v0")
        .training(
            model_size="XS",
            training_ratio=1,
            model={
                "batch_size_B": 1,
                "batch_length_T": 1,
                "horizon_H": 1,
                "gamma": 0.997,
                "model_size": "XS",
            },
        )
    )

    algo = config.build()

    for i in range(100):
        result = algo.train()
        print(f"Iteration: {i+1} Timesteps total: {result['agent_timesteps_total']} Steps trained: {result['num_env_steps_trained']} Episodes total: {result['episodes_total']}")

Output:

Starting training session 1
... 
Iteration: 98 Timesteps total: 99329 Steps trained: 100352 Episodes total: 4138
Iteration: 99 Timesteps total: 100353 Steps trained: 101376 Episodes total: 4181
Iteration: 100 Timesteps total: 101377 Steps trained: 102400 Episodes total: 4224

Starting training session 2
...
Iteration: 10 Timesteps total: 9217 Steps trained: 10240 Episodes total: 384
Iteration: 11 Timesteps total: 10241 Steps trained: 11264 Episodes total: 426
Iteration: 12 Timesteps total: 11265 Steps trained: 12288 Episodes total: 469
Iteration: 13 Timesteps total: 12289 Steps trained: 13312 Episodes total: 512
Iteration: 14 Timesteps total: 13313 Steps trained: 14336 Episodes total: 554
Iteration: 15 Timesteps total: 14337 Steps trained: 15360 Episodes total: 597
Iteration: 16 Timesteps total: 15361 Steps trained: 16384 Episodes total: 640

The second session is hanging after 16 iterations.

What is the default workflow to run multiple training sessions? Should one just avoid using Jupyter Notebooks?
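
For reference, here is a minimal sketch of one pattern that might be worth trying. It assumes, without confirmation, that the hang is related to resources left over from the previous session. The helper name `run_sessions` and its parameters are hypothetical. The only RLlib calls it relies on are `train()` and `stop()` on the built algorithm; whether calling `stop()` actually resolves the hang is untested.

```python
from typing import Any, Callable, Dict, List


def run_sessions(
    build_algo: Callable[[], Any],
    num_sessions: int,
    num_iterations: int,
) -> List[List[Dict[str, Any]]]:
    """Run independent training sessions, stopping each algorithm when done.

    `build_algo` is a hypothetical factory that returns a fresh object with
    `train()` and `stop()` methods (for example, the result of `config.build()`).
    """
    all_results: List[List[Dict[str, Any]]] = []
    for session in range(num_sessions):
        print(f"Starting training session {session + 1}")
        algo = build_algo()
        session_results: List[Dict[str, Any]] = []
        try:
            for _ in range(num_iterations):
                session_results.append(algo.train())
        finally:
            # Release the algorithm's resources even if train() raises,
            # so the next session starts from a clean state.
            algo.stop()
        all_results.append(session_results)
    return all_results
```

With the `config` from the loop above, this could be called as `run_sessions(lambda: config.build(), num_sessions=10, num_iterations=100)`. The `try`/`finally` keeps the cleanup in one place, so an interrupted notebook cell still stops the current algorithm before the next one is built.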

Is anyone else experiencing something similar?