How to train until convergence

carlorop · July 6, 2022, 10:06am

Here is an example of the configuration I am using to run some models:



    experiment_params = {
        "training": {
            "env": "example",
            "run": args.algorithm,
            "stop": {
                "training_iteration": 2000,
                # "timesteps_total": 8000000,
            },
            "local_dir": "/opt/ml/output/intermediate",
            "checkpoint_at_end": True,
            "checkpoint_freq": 10,
            "config": {
                "num_workers": int(os.cpu_count()) - 1,
                # "sigma0": 0.01,
                "lr": 0.001,
                # "sample_batch_size": 1,
                "num_gpus": num_gpus,
                "gamma": float(args.gamma),
                "seed": args.seed,
                # RAY 0.5 syntax: gpu_fraction
                # "prioritized_replay": False,
            },
        }
    }
    ray.tune.run_experiments(copy.deepcopy(experiment_params))

I am using RLlib 0.8.5. Is there any way to change the stopping criteria so that training stops once the episode reward mean has converged? I wouldn’t mind updating RLlib.

arturn · July 6, 2022, 4:01pm

Hi! The definition of convergence depends on the problem and the algorithm. Tune can only stop on metrics that are reported to it, so you’ll have to implement a metric that digests a record of past mean rewards and provides a measurement of convergence for Tune to observe.
Here’s what to do:

1. Write a custom Callback that tracks your metric of interest over time
1.1 Write the metric (maybe a simple average of absolute change in episode reward mean) into results
2. Tell Tune to monitor that metric as a stopping metric

See here for example.
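
To make the steps above concrete, here is a minimal sketch of one possible convergence metric: the mean absolute change of `episode_reward_mean` over a sliding window. The names `RewardConvergenceTracker` and `make_stop_fn`, and the `window` and `tol` values, are made up for illustration and are not part of RLlib or Tune. The sketch assumes each result dict contains `episode_reward_mean`, and that your Tune version accepts a callable `stop(trial_id, result)` as the stopping criterion, which recent versions do; I have not checked this against 0.8.5.

```python
from collections import deque


class RewardConvergenceTracker:
    """Mean absolute change of a scalar over a sliding window (hypothetical helper)."""

    def __init__(self, window=10):
        # window + 1 stored values give `window` consecutive differences.
        self.values = deque(maxlen=window + 1)

    def update(self, value):
        """Record a value; return the mean absolute change over the window.

        Returns infinity until the window is full, so training is never
        stopped before enough iterations have been observed.
        """
        self.values.append(float(value))
        if len(self.values) < self.values.maxlen:
            return float("inf")
        vals = list(self.values)
        diffs = [abs(b - a) for a, b in zip(vals, vals[1:])]
        return sum(diffs) / len(diffs)


def make_stop_fn(window=10, tol=0.01):
    """Build a stop(trial_id, result) callable with one tracker per trial.

    `tol` is an assumed threshold; a sensible value depends on the reward scale.
    """
    trackers = {}

    def stop(trial_id, result):
        if trial_id not in trackers:
            trackers[trial_id] = RewardConvergenceTracker(window)
        change = trackers[trial_id].update(result["episode_reward_mean"])
        # A NaN reward (no finished episodes yet) compares False, so no stop.
        return change < tol

    return stop
```

If your version supports it, `make_stop_fn(window=20, tol=0.05)` would take the place of the `"stop"` dict. With the callback approach, the same tracker could be updated in `on_train_result` and its value written into `result`. Keep in mind that a dict-style stop condition triggers when the metric reaches or exceeds the threshold, so there you would want to report something that grows at convergence, for example a 0/1 "converged" flag, and not the raw change.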
