Conditioned hyperparam_mutations ray/tune

First of all, thanks to the whole Ray team for the amazing job they are doing!

My goal is to optimize the number of Conv1D layers and the kernel size of each layer using PBT and Keras. I therefore assume my hyperparam_mutations should look something like this:
"layers": tune.choice([2, 3, 4, 5, 6]),
"kernels": {0: [3, 5, 7], 1: [3, 5, 7], … n: [3, 5, 7]},
where n is the number of "layers", so the dict should have a different length depending on the value of "layers".

Is this possible?

I tried many things, for instance setting both config and hyperparam_mutations as follows:

hp = {
    "layers": tune.choice([2, 3, 4, 5]),
    "kernels": lambda spec: {il: tune.choice([3, 5, 7]) for il in range(spec.config["layers"])},
}

In create_model(hp) I use it as follows:

for layer in range(hp["layers"]):
    x = Conv1D(64, kernel_size=hp["kernels"][layer], … )(x)

In this particular case I get an error: TypeError: 'function' object is not subscriptable.

I would appreciate any clue on how I can approach this.


@rliaw @kai could you take a look?

I would appreciate it if someone who knows could at least tell me whether this is possible. Thanks!

Below I provide another example for the same problem:
Suppose you have two optimizers, Adam and SGD. For Adam you have only the 'lr' hyperparameter, whereas for SGD you have 'lr' and 'momentum', so when Adam is chosen you want to skip optimizing 'momentum'. Do you think this is possible?

Hi @eeakimov, for these kinds of problems you would usually just want to sample the optional parameters in the search space and ignore them in cases where you don’t need them. So e.g.

config = {
    "num_layers": tune.choice([2, 3, 4]),
    "kernel_layer_1": tune.choice([3, 5, 7]),
    "kernel_layer_2": tune.choice([3, 5, 7]),
    # ...
}

The same goes for optimizers:

config = {
    "optimizer": tune.choice(["adam", "sgd"]),
    # quantization step of 1e-4 so that both bounds are multiples of it
    "lr": tune.qloguniform(1e-4, 1e-1, 1e-4),
    "momentum": tune.uniform(0.7, 1.0),
}

And then just ignore momentum in your trainable when adam is used.
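
To make the "sample everything, ignore what you don't need" idea concrete, here is a minimal sketch of how a trainable might read such a config. The helper names `active_kernel_sizes` and `optimizer_kwargs` are hypothetical, and the sketch assumes the kernel keys continue as `kernel_layer_1`, `kernel_layer_2`, ... up to the largest possible `num_layers`. It is plain Python, so it runs without Ray or Keras.

```python
def active_kernel_sizes(config):
    """Return kernel sizes only for the layers that will actually be built.

    Assumes keys named kernel_layer_1 ... kernel_layer_N exist in the config
    (hypothetical naming that continues the pattern shown above).
    """
    return [config[f"kernel_layer_{i}"] for i in range(1, config["num_layers"] + 1)]


def optimizer_kwargs(config):
    """Collect only the hyperparameters the chosen optimizer uses."""
    kwargs = {"learning_rate": config["lr"]}
    if config["optimizer"] == "sgd":
        # momentum is sampled for every trial but only consumed by SGD
        kwargs["momentum"] = config["momentum"]
    return kwargs


# Example of a resolved config, as a trial would receive it
# (values are made up for illustration).
example_config = {
    "num_layers": 2,
    "kernel_layer_1": 3,
    "kernel_layer_2": 7,
    "kernel_layer_3": 5,  # sampled but unused because num_layers == 2
    "kernel_layer_4": 5,  # sampled but unused
    "optimizer": "adam",
    "lr": 0.001,
    "momentum": 0.9,      # sampled but unused because optimizer == "adam"
}

kernels = active_kernel_sizes(example_config)
opt_kwargs = optimizer_kwargs(example_config)
```

Inside create_model you would then loop over `kernels` to build the Conv1D layers and pass `opt_kwargs` to the Keras optimizer matching `config["optimizer"]`.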

These search space configurations should work with PBT hyperparameter mutations out of the box. CC @amogkam to confirm.
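
As a rough illustration, the matching hyperparam_mutations for the layer example could be built as below. This is only a sketch: `MAX_LAYERS` and `build_mutations` are hypothetical names, and it relies on PBT accepting a plain list of candidate values per key. The scheduler call is left as a comment because the remaining arguments depend on your setup.

```python
MAX_LAYERS = 4  # assumption: the largest value in the "num_layers" choice


def build_mutations(max_layers=MAX_LAYERS):
    """Build a flat mutation dict with one kernel entry per possible layer."""
    mutations = {"num_layers": list(range(2, max_layers + 1))}
    for i in range(1, max_layers + 1):
        # every layer slot gets the same candidate kernel sizes
        mutations[f"kernel_layer_{i}"] = [3, 5, 7]
    return mutations


hyperparam_mutations = build_mutations()

# Hypothetical usage (requires Ray, not executed here):
# from ray.tune.schedulers import PopulationBasedTraining
# pbt = PopulationBasedTraining(
#     time_attr="training_iteration",
#     perturbation_interval=5,  # made-up value
#     hyperparam_mutations=hyperparam_mutations,
# )
```

Keys for layers beyond the current `num_layers` are still mutated, but the trainable simply does not read them.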

Note however that adding (sometimes unnecessary) hyperparameters can slow down convergence. I think it should be possible to use the conditional search spaces with PBT, but again cc @amogkam who might be able to shed more light onto this.


Thanks a lot for the answer, that is pretty much how I approach it now.