Multiple minor issues with Ray Tune

I am running into several minor issues with Ray Tune that affect the performance of hyperparameter tuning. They have been there since my first attempts at using Ray Tune. I am new to Ray Tune and to hyperparameter optimization, so I may be doing something wrong. Could someone double-check this with me before I raise any issues on GitHub? Thanks!

  1. raise ValueError('The argument kernel_size cannot contain 0(s).
    It is raised by TensorFlow because the PBT algorithm generates a zero during the mutation process, even though the search space I specified excludes zero.

hyperparam_mutations = { …
    'kernel': lambda: np.random.randint(1, 3),
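One possible explanation, sketched under an assumption I have not confirmed against the Ray source: when a mutation is given as a callable, PBT may either resample it or scale the current value by 0.8 or 1.2, and then truncate integer values back to int. If that is what happens, a kernel size of 1 scaled by 0.8 becomes 0. The helpers `perturb` and `perturb_choice` below are hypothetical stand-ins, not Ray functions.

```python
import random


def perturb(value, factor):
    """Scale a value by a factor, the way a PBT-style explore step might (assumption)."""
    new_value = value * factor
    # Integer hyperparameters are truncated back to int, so 1 * 0.8 becomes 0.
    if isinstance(value, int):
        new_value = int(new_value)
    return new_value


def perturb_choice(value, choices, rng=random):
    """Move to a neighbouring entry of an explicit list, never leaving the list."""
    index = choices.index(value)
    step = rng.choice([-1, 1])
    return choices[min(len(choices) - 1, max(0, index + step))]


kernel_after_shrink = perturb(1, 0.8)            # int(0.8): the problematic case
kernel_after_grow = perturb(2, 1.2)              # int(2.4)
kernel_from_list = perturb_choice(1, [1, 2, 3])  # always one of the listed values
```

If this is indeed the cause, giving the mutation as an explicit list of allowed kernel sizes instead of a lambda might keep the perturbed value inside the intended range.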

  2. "no checkpoint for trial" - it is already being discussed here:
    Intended behaviour or smth wrong with my code? "no checkpoint for trial. Skip exploit for Trial" - #7 by eaakimov

  3. This one is strange. When I added a custom loss function, Ray started to (A) make the losses negative and (B) apparently switch to mode="max":

2021-03-25 16:39:07,319 INFO – [exploit] transferring weights from trial fit_tuner_a5f17_00003 (score -0.7045304775238037) → fit_tuner_a5f17_00000 (score -0.9259571433067322)

Current best trial: a5f17_00000 with val_loss=0.6893951892852783 and parameters=…

However, in all other places, the loss values appear as expected (positive).
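A guess at what is going on, assuming the scheduler internally converts every metric into a "higher is better" score by negating it when mode="min": the negative numbers in the [exploit] log would then just be the sign-flipped losses, and the direction of the weight transfer would still be consistent with minimization. The function `to_score` is a hypothetical illustration, not part of Ray.

```python
def to_score(metric_value, mode):
    """Convert a metric into a 'higher is better' score (assumed internal convention)."""
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    # With mode="min" the sign is flipped, so a lower loss gives a higher score.
    return -metric_value if mode == "min" else metric_value


# Loss magnitudes taken from the log line above.
donor_score = to_score(0.7045304775238037, "min")      # fit_tuner_a5f17_00003
recipient_score = to_score(0.9259571433067322, "min")  # fit_tuner_a5f17_00000

# The donor has the higher score, which corresponds to the lower loss.
donor_is_better = donor_score > recipient_score
```

Under this reading, weights flow from the trial with the smaller loss to the trial with the larger loss, which is what minimization should do.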

  4. custom_explore_fn does not seem to work, at least not the way I think it should:

def explore(config):
    if config['activation'] == 'lrelu':
        config['activation'] = LeakyReLU(alpha=0.05)
    return config

scheduler = PopulationBasedTraining(
    hyperparam_mutations=self.hypers,
    custom_explore_fn=explore)

This gives the error: 'lrelu' not defined, which means that config["activation"] is NOT equal to LeakyReLU(alpha=0.05) when the model is being optimized. However, if I do the same replacement in the trainable, it works.
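My current guess, which I have not verified, is that custom_explore_fn is only applied when a trial is perturbed, so the initial trials still receive the plain string. A minimal sketch of the workaround that does work for me, resolving the string inside the trainable, is below. The names `ACTIVATION_FACTORIES` and `resolve_activation` are hypothetical, and a plain Python function stands in for the Keras layer so the snippet runs without TensorFlow; in the real model the factory would return LeakyReLU(alpha=0.05).

```python
def leaky_relu(x, alpha=0.05):
    """Plain-Python stand-in for a leaky ReLU activation."""
    return x if x > 0 else alpha * x


# Map config strings to factories; the config itself keeps only the string.
ACTIVATION_FACTORIES = {
    "lrelu": lambda: (lambda x: leaky_relu(x, alpha=0.05)),
    "relu": lambda: (lambda x: max(0.0, x)),
}


def resolve_activation(config):
    """Build the activation inside the trainable from its name in the config."""
    name = config["activation"]
    if name not in ACTIVATION_FACTORIES:
        raise ValueError(f"unknown activation: {name!r}")
    return ACTIVATION_FACTORIES[name]()


config = {"activation": "lrelu"}
activation = resolve_activation(config)
negative_output = activation(-2.0)  # 0.05 * -2.0
```

Keeping only strings in the config also means the scheduler never has to copy or mutate a layer object.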

I know these look strange, at least to me. I would appreciate any input! Let me know if I need to attach any code.

@rliaw @kai please take a look!

Hmm @amogkam when you have the chance, could you take a look into this?