Tune.run() doesn't work: runs endlessly

Hi,
I'm having trouble training a PPO model to get my agent to do a very simple job. My RL problem is a single car that has to be moved to the next road in a road network until it reaches its destination; once it arrives, it is given a reward of 1.
I have defined my own custom Gymnasium environment, and it works well when I test it directly.
When I use Stable Baselines3 to train a PPO model, everything works and produces a nice result that I can visualize with TensorBoard, so that confirms my environment is working properly.
But when I use RLlib via ray tune.run(), it runs the simulation endlessly, even though I have tried many ways to force it to stop after a certain number of timesteps or iterations. No dice!
Here is the config:

import ray
from ray import tune
from ray.tune.registry import register_env
from ray.tune.stopper import MaximumIterationStopper

def main():
    ray.init(ignore_reinit_error=True)
    register_env('car_env/car-v0', create_env)  # create_env is my env factory, defined elsewhere

    custom_config = {
        "lr": 0.0001,           # Learning rate
        "entropy_coeff": 0.01,  # Entropy coefficient
        "num_steps": 1,         # Number of steps or iterations
    }
    log_dir = r'E:\My files\sumo\my example\Car Control\experiments\rllib'
    max_iterations = 0
    stopper = MaximumIterationStopper(max_iter=max_iterations)
    config = {
        "env": "car_env/car-v0",
        "framework": "torch",
        "num_envs_per_worker": 1,
        "seed": 123,
        "log_level": "ERROR",
        "ignore_worker_failures": False,
        "lr_schedule": [[0, 1e-1], [int(1e2), 1e-2], [int(1e3), 1e-3]],
        # "evaluation_interval": 2,
        # "evaluation_num_episodes": 4,
        "num_gpus": 0,
        "num_rollout_workers": 1,
        # "num_evaluation_workers": 1,
        **custom_config,
    }
    analysis = tune.run(
        "PPO",
        name='experiment1',
        config=config,
        # stop={
        #     'training_iteration': 1,
        #     "episode_reward_mean": 1,
        #     'timesteps_total': 2,
        # },
        local_dir=log_dir + '/net2',
        checkpoint_at_end=True,
        resume=False,
        stop=stopper,
    )

if __name__ == '__main__':
    main()

I'm not even sure whether training has actually started, but the simulation is definitely running and moving the car in the SUMO simulator.

Just ignore the warning "no connection between edges … and …": I wrote a line of code to truncate the episode if the car is directed onto a road (edge) from which there is no route to its final destination, so this is part of the problem statement.

It seems like your Ray Tune training is running indefinitely and not stopping as expected. This could be due to a few reasons:

  1. Stopping condition: In your code, you’ve commented out the stop parameter in the tune.run() function. This parameter is used to specify the stopping criteria for the training. If it’s not provided, the training will run indefinitely. You can specify stopping criteria like a maximum number of iterations, a minimum or maximum reward, etc. For example, to stop after 100 iterations, you can use stop={"training_iteration": 100}.

  2. MaximumIterationStopper: You’re using a MaximumIterationStopper with max_iter set to 0. This might be causing the training to run indefinitely. Try setting max_iter to a larger value.

  3. Training not starting: If you’re not sure whether the training has started, check the logs produced by Ray; once training is underway, you should see periodic progress reports. Also note that PPO collects a full train batch (train_batch_size, which defaults to 4000 timesteps in RLlib) before the first iteration completes, so the simulator can run for quite a while before any result is reported.
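To make point 1 concrete, here is a rough sketch in plain Python (not Ray's actual implementation) of how the dict form of stop behaves: a trial stops as soon as any listed metric reaches its threshold.

```python
def dict_stop(result, stop):
    # Approximate semantics of tune.run(stop={...}): stop the trial when
    # ANY metric named in the stop dict reaches its threshold.
    return any(result.get(metric, float("-inf")) >= threshold
               for metric, threshold in stop.items())

stop = {"training_iteration": 100, "episode_reward_mean": 1}
early = dict_stop({"training_iteration": 5, "episode_reward_mean": 0.2}, stop)
done = dict_stop({"training_iteration": 5, "episode_reward_mean": 1.0}, stop)
print(early, done)  # False True
```

So with stop={"training_iteration": 100, "episode_reward_mean": 1}, your run would end either at iteration 100 or as soon as the mean episode reward hits 1, whichever comes first.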

Here’s how you can modify your code to include a stopping condition:

def main():
    # ... rest of your code ...

    stopper = MaximumIterationStopper(max_iter=100)  # Set max_iter to a larger value

    config = {
        # ... rest of your config ...
    }

    analysis = tune.run(
        "PPO",
        name='experiment1',
        config=config,
        local_dir=log_dir+'/net2',
        checkpoint_at_end=True,
        resume=False,
        stop=stopper,  # Make sure to include the stopper
    )

if __name__ == '__main__':
    main()
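For intuition, the stopper object you pass to tune.run() is just a callable that Tune invokes after every reported training iteration; returning True stops the trial. A minimal stand-in for MaximumIterationStopper (a sketch, not Ray's actual code) looks like:

```python
class MaxIterStopper:
    """Minimal stand-in for Tune's MaximumIterationStopper (illustrative only).

    Tune calls __call__ after every training iteration with the trial id
    and the latest result dict; returning True stops that trial.
    """
    def __init__(self, max_iter):
        self.max_iter = max_iter
        self.iters = {}

    def __call__(self, trial_id, result):
        self.iters[trial_id] = self.iters.get(trial_id, 0) + 1
        return self.iters[trial_id] >= self.max_iter

    def stop_all(self):
        return False

stopper = MaxIterStopper(max_iter=3)
decisions = [stopper("trial_1", {}) for _ in range(4)]
print(decisions)  # [False, False, True, True]
```

This also shows why the stopper only helps once iterations are actually being reported: it is never consulted until the first training iteration completes.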

For more information on how to define stopping criteria in Ray Tune, you can refer to the Ray documentation.
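One more thing: since you truncate the episode when the car ends up on an edge with no route to the destination, make sure your env signals that through the truncated flag of Gymnasium's 5-tuple step() return, (obs, reward, terminated, truncated, info). If neither flag is ever set, episodes never end and metrics such as episode_reward_mean won't be reported until at least one episode completes. A toy stand-in (not your real env; all names here are illustrative):

```python
class ToyCarEnv:
    """Toy illustration of the terminated/truncated split in Gymnasium's
    5-tuple step() API (no actual gymnasium dependency)."""
    def __init__(self, goal=3):
        self.goal = goal
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position, {}

    def step(self, action):
        self.position += action                  # +1: toward goal, -1: dead-end edge
        terminated = self.position >= self.goal  # destination reached
        truncated = self.position < 0            # no route to destination: cut episode short
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}

env = ToyCarEnv()
env.reset()
for _ in range(3):
    obs, reward, terminated, truncated, info = env.step(+1)
print(terminated, reward)  # True 1.0
```

terminated marks a natural end of the task (goal reached), while truncated marks an externally imposed cutoff like your dead-end rule; RLlib treats both as episode boundaries.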