Handling NaN during population based training

Hi all,

I am wondering what the best way is to handle NaNs during population based training. I am tuning learning rates, and the model is fairly sensitive to changes, which can result in a NaN for val_loss. Can trials recover after hitting a NaN? Or is it better to add a stopping criterion, e.g. stop={"val_loss": nan}, that will kill the NaN trial and start another?

Thanks!

Hmm, maybe you could return a very bad score whenever you get a NaN, and then let PBT terminate that trial automatically.
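
Here is a minimal sketch of that idea, assuming the metric is a loss that PBT is minimizing. The names `sanitize_loss` and `BAD_LOSS` are hypothetical and not part of any library, and the penalty value is just a placeholder that should be clearly worse than any loss the model can realistically produce:

```python
import math

# Hypothetical penalty: pick something far worse than any real validation loss.
BAD_LOSS = 1e9


def sanitize_loss(val_loss: float, bad_value: float = BAD_LOSS) -> float:
    """Return val_loss unchanged if it is finite, otherwise a large finite penalty."""
    if not math.isfinite(val_loss):  # catches NaN as well as +/- infinity
        return bad_value
    return val_loss


# In the training loop, report the sanitized value instead of the raw one,
# e.g. (assuming the reported metric is named "val_loss"):
#   metrics = {"val_loss": sanitize_loss(raw_val_loss)}
```

Reporting a finite penalty rather than the raw NaN matters because comparisons against NaN always evaluate to False, so a NaN score cannot be ranked against the scores of other trials.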

That has worked pretty well so far, and it is definitely a better solution than using a stopping criterion. Thanks!
