I am having several minor issues with Ray Tune that affect the performance of hyperparameter tuning. They have been present since my first attempts at implementing Ray Tune. I am new to Ray Tune and parameter optimization, so it could be that I am doing something wrong; could someone double-check this with me before I raise any issues on GitHub? Thanks!
1. `ValueError: The argument kernel_size cannot contain 0(s).`
TensorFlow raises this because the PBT algorithm generates a zero kernel size during the mutation process, even though the search space I specified excludes zero:
```python
hyperparam_mutations = {
    # …
    'kernel': lambda: np.random.randint(1, 3),
    # …
}
```
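If it helps to reproduce: my guess (an assumption from reading about PBT's default behaviour, not confirmed against the Ray source) is that when PBT does not resample from the lambda, it perturbs the current numeric value by a factor of 0.8 or 1.2, so a kernel size of 1 can become 0.8, which truncates to 0 when cast back to an integer:

```python
# Sketch of the suspected failure mode (assumption: PBT perturbs
# numeric values by factors 0.8 / 1.2 instead of resampling).
kernel = 1                 # sampled from np.random.randint(1, 3)
perturbed = kernel * 0.8   # PBT "perturb" step on a numeric value
as_int = int(perturbed)    # truncation back to an integer
print(as_int)              # 0 -> triggers the kernel_size ValueError
```

If that is the cause, clamping the value to the valid range inside the trainable would work around it.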
2. "No checkpoint for trial" - this is already being discussed here: "no checkpoint for trial. Skip exploit for Trial" - #7 by eaakimov. Is this intended behaviour, or is something wrong with my code?
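For what it's worth, my current understanding (an assumption drawn from the linked thread, not from the docs) is that PBT can only exploit a trial that has already saved a checkpoint, so the trainable needs to write one at least as often as `perturbation_interval`. A minimal stand-in for that checkpointing cadence, with a temp directory in place of Ray's checkpoint directory:

```python
import os
import tempfile

# Sketch (assumption): a trial that checkpoints every 2 iterations can
# only be exploited at those iterations; in between, PBT skips it.
def train_step(step, checkpoint_root, checkpoint_freq=2):
    if step % checkpoint_freq == 0:
        d = os.path.join(checkpoint_root, f"step_{step}")
        os.makedirs(d, exist_ok=True)
        with open(os.path.join(d, "weights.txt"), "w") as f:
            f.write("model weights")   # placeholder for real weights
        return d
    return None  # no checkpoint this iteration

root = tempfile.mkdtemp()
saved = [train_step(s, root) for s in range(4)]
print([p is not None for p in saved])  # [True, False, True, False]
```

So if the warning appears only early in training or for freshly perturbed trials, it may just mean no checkpoint exists yet rather than a bug.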
3. This one is strange. When I added a custom loss function, Ray started to (A) make the losses negative and (B) apparently switch to mode="max":
```
2021-03-25 16:39:07,319 INFO pbt.py:532 -- [exploit] transferring weights from trial fit_tuner_a5f17_00003 (score -0.7045304775238037) -> fit_tuner_a5f17_00000 (score -0.9259571433067322)
Current best trial: a5f17_00000 with val_loss=0.6893951892852783 and parameters=…
```
However, everywhere else the loss values appear positive, as expected.
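One possible explanation (my assumption, not something I have confirmed in the Ray docs): a scheduler running with mode="min" may compare trials internally by negating the metric so that "higher score is better", which would make the `[exploit]` log show negative scores while the reported `val_loss` stays positive:

```python
# Sketch: negating a minimized metric turns it into a maximized score.
val_loss_good = 0.7045304775238037  # trial ..._00003 from the log
val_loss_bad = 0.9259571433067322   # trial ..._00000 from the log
score_good, score_bad = -val_loss_good, -val_loss_bad

# The trial with the lower loss gets the higher (less negative) score,
# which matches the direction of the weight transfer in the log line.
print(score_good > score_bad)       # True
```

If this is what is happening, the negative numbers would be cosmetic rather than a sign that the optimization direction flipped.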
4. `custom_explore_fn` does not seem to work, at least not the way I think it should:
```python
def explore(config):
    if config['activation'] == 'lrelu':
        config['activation'] = LeakyReLU(alpha=0.05)
    return config
```
```python
scheduler = PopulationBasedTraining(
    time_attr="training_iteration",
    perturbation_interval=self.tuner_params['pert_int'],
    hyperparam_mutations=self.hypers,
    custom_explore_fn=explore,
)
```
This gives the error `'lrelu' not defined`, which means `config["activation"]` is NOT equal to `LeakyReLU(alpha=0.05)` when the model is being optimized. However, if I do the same replacement inside the trainable, it works.
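In case my workaround is useful to others: keeping the config value as a plain string and resolving it to a layer object inside the trainable sidesteps the problem. This is a sketch under the assumption that the config must stay serializable when handed between the scheduler and the trainable; the tuple below is a hypothetical stand-in for `tf.keras.layers.LeakyReLU(alpha=alpha)`:

```python
def explore(config):
    # Leave the value as a string; object construction happens later,
    # so the config stays serializable for the scheduler.
    return config

def make_activation(name, alpha=0.05):
    # Resolve the string inside the trainable. The tuple is a
    # hypothetical stand-in for tf.keras.layers.LeakyReLU(alpha=alpha).
    if name == "lrelu":
        return ("LeakyReLU", alpha)
    return ("Activation", name)

config = explore({"activation": "lrelu"})
print(make_activation(config["activation"]))  # ('LeakyReLU', 0.05)
```

With this split, `explore` never needs to reference `LeakyReLU` at all, so the `'lrelu' not defined` error cannot occur there.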
I know these look strange, to me at least. I would appreciate any input! Let me know if I need to attach any code.
Thanks!