How severe does this issue affect your experience of using Ray?
- High: It blocks me to complete my task.
Hi,
Thank you for making Ray Tune so easy to use. I am new to Ray Tune and I am trying to use it to tune the parameter values of an iterative, non-learnable method. The method has many parameters, so optimizing all of them will take a while. However, the machine on which I would like to run Ray Tune automatically terminates any program that has been running for 72 hours. I would like Ray Tune to shut down cleanly a few minutes before it reaches the 72-hour limit. This is what I have done so far:
result = tune.run(
    tune.with_parameters(partial(optimize, args, relative_data_paths)),
    name=args.tuning_exp_name,
    resources_per_trial={"cpu": args.n_cpus, "gpu": args.n_gpus},
    config=config,
    stop={'time_total_s': args.stop_time_total_h * 3600},
    num_samples=args.n_samples,  # number of trials
    scheduler=ASHAScheduler(),
    metric='score',
    mode=args.tuning_mode,
    fail_fast=True,  # stop the entire Tune run as soon as any trial errors
    log_to_file=True,  # save stdout and stderr to trial_logdir/stdout and trial_logdir/stderr
)
where args holds the command-line arguments and args.stop_time_total_h is given in hours. To test whether the optimization stops after a given time, I ran it with args.stop_time_total_h=0.05, which is 3 minutes. Ray Tune still seemed to run all the trials, regardless of stop={'time_total_s': args.stop_time_total_h*3600}.
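For reference, here is the arithmetic behind the stop value as a small sketch. The helper name budget_seconds and the 10-minute margin are my own placeholders, not part of Ray. As far as I understand, the time_total_s key in stop is checked for each trial separately, while tune.run also accepts a time_budget_s argument meant to limit the whole experiment, so the value computed here might belong there instead; I have not verified this on my setup.

```python
def budget_seconds(limit_hours: float, margin_minutes: float = 0.0) -> float:
    """Convert a wall-clock limit in hours to seconds, minus a safety margin."""
    seconds = limit_hours * 3600 - margin_minutes * 60
    if seconds <= 0:
        raise ValueError("the margin is at least as large as the limit")
    return seconds


# The quick test from this post: 0.05 h is about 180 s (3 minutes).
test_budget = budget_seconds(0.05)

# The real run: 72 h minus a hypothetical 10-minute margin for a clean shutdown.
full_budget = budget_seconds(72, margin_minutes=10)

print(test_budget, full_budget)
```

Either value could then be passed wherever the time limit goes, for example in place of args.stop_time_total_h*3600 in the call above.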
Could anyone tell me whether I did something wrong?