Ray Tune was recommended to me by Nico Pinto (he was the first person to train NNs on GPUs, and taught Alex Krizhevsky how to do it, which set the stage for AlexNet).
I am interested in Ray Tune early stopping (See “How does early termination (e.g. Hyperband/ASHA) work?” in Ray docs).
It appears you have a grace_period that sets the minimum number of epochs, but not a patience parameter (see "Early Stopping" in the PyTorch Lightning documentation). A patience parameter is very useful because most ML objectives are jittery: you don't want to terminate a trial just because a single epoch temporarily worsens the objective.
Is there a way to implement patience in Ray, so that early stopping doesn't kick in until convergence has genuinely stalled? i.e. so that Ray terminates a trial only if the objective fails to improve for a certain number of epochs?
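For what it's worth, here is a minimal sketch of the kind of patience logic I have in mind, written as a dependency-free class. The interface (a `__call__(trial_id, result)` method returning whether to stop that trial, plus `stop_all()`) mirrors Ray Tune's Stopper interface, but the class name, parameters, and everything else here are my own assumptions, not Ray's API:

```python
from collections import defaultdict


class PatienceStopper:
    """Sketch of a patience-based stopper (hypothetical, not part of Ray).

    In a real setup this would subclass ray.tune.Stopper and be passed to
    tune.run(..., stop=PatienceStopper(...)); it is kept dependency-free here.
    A trial is stopped only after `patience` consecutive epochs without an
    improvement of at least `min_delta` in the metric (lower is better).
    """

    def __init__(self, metric="loss", patience=5, min_delta=0.0):
        self.metric = metric
        self.patience = patience
        self.min_delta = min_delta
        self.best = {}                    # trial_id -> best metric value seen
        self.num_bad = defaultdict(int)   # trial_id -> epochs without improvement

    def __call__(self, trial_id, result):
        value = result[self.metric]
        best = self.best.get(trial_id, float("inf"))
        if value < best - self.min_delta:
            # Improvement: record new best and reset the patience counter.
            self.best[trial_id] = value
            self.num_bad[trial_id] = 0
        else:
            self.num_bad[trial_id] += 1
        # Stop this trial once patience is exhausted.
        return self.num_bad[trial_id] >= self.patience

    def stop_all(self):
        # Never stop the whole experiment from here.
        return False


# Example: a jittery loss curve. A single bad epoch (0.95) does not kill
# the trial; only a sustained plateau of `patience` epochs does.
stopper = PatienceStopper(metric="loss", patience=3)
losses = [1.0, 0.9, 0.95, 0.85, 0.9, 0.91, 0.92]
decisions = [stopper("trial_1", {"loss": l}) for l in losses]
print(decisions)  # → [False, False, False, False, False, False, True]
```

The point is that the single uptick to 0.95 resets nothing fatal; the trial survives it and is only stopped after three consecutive epochs without beating the best loss.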
Unfortunately, this was an issue I had with Optuna (https://github.com/optuna/optuna/issues/1447) and that is one of the reasons I am considering Ray.
This might be related to "Tuning process with PBT is killed after a very small number of iterations (6/500)".