Here is an example of the configuration I am using to run some models:
import copy
import os

import ray.tune

# args and num_gpus are defined earlier in the script.
experiment_params = {
    "training": {
        "env": "example",
        "run": args.algorithm,
        "stop": {
            "training_iteration": 2000,
            # "timesteps_total": 8000000,
        },
        "local_dir": "/opt/ml/output/intermediate",
        "checkpoint_at_end": True,
        "checkpoint_freq": 10,
        "config": {
            "num_workers": int(os.cpu_count()) - 1,
            # "sigma0": 0.01,
            "lr": 0.001,
            # "sample_batch_size": 1,
            "num_gpus": num_gpus,
            "gamma": float(args.gamma),
            "seed": args.seed,
            # Ray 0.5 syntax: gpu_fraction
            # "prioritized_replay": False,
        },
    }
}

ray.tune.run_experiments(copy.deepcopy(experiment_params))
I am using RLlib 0.8.5. Is there any way to change the stopping criteria so that training stops once the episode reward mean has converged? I wouldn't mind updating RLlib if that helps.
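For context, this is roughly the kind of thing I have in mind. It is only a sketch and I haven't tested it on 0.8.5; I'm assuming a Ray version recent enough that Tune accepts a callable as the stop criterion. The names stop_on_convergence, WINDOW, and STD_TOL are mine, and the window size and tolerance are placeholder values that would need tuning for my environment's reward scale:

from collections import defaultdict, deque

import numpy as np
from ray import tune

# Placeholder values -- adjust for the reward scale of the env.
WINDOW = 20     # number of recent results to consider per trial
STD_TOL = 0.5   # treat the trial as converged below this reward std

_recent = defaultdict(lambda: deque(maxlen=WINDOW))

def stop_on_convergence(trial_id, result):
    """Return True once episode_reward_mean has stayed flat for WINDOW results."""
    rewards = _recent[trial_id]
    rewards.append(result["episode_reward_mean"])
    return len(rewards) == WINDOW and float(np.std(rewards)) < STD_TOL

# Then, instead of the "stop" dict above, something like:
# tune.run(args.algorithm,
#          stop=stop_on_convergence,
#          config=experiment_params["training"]["config"])

I also noticed that newer Ray releases ship a TrialPlateauStopper in ray.tune.stopper, which seems to implement this idea out of the box; would updating and using that be the recommended approach?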