Are the hyperparameters searched using Ray reliable? Why do two runs give different results?

Are the hyperparameters found by a Ray search reliable? Why do two runs give different results? The only difference between the two runs is that "batch_size": tune.choice([4, 8]) became "batch_size": tune.choice([4, 8, 16]), yet the hyperparameters selected by the Ray search are not the same.

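Yes. Two runs can differ because the Tuner randomly samples N trials (N=10 in your case) from your param_space. tune.choice picks one of the listed values at random for each trial, so changing the list changes the draws and therefore the set of configurations that get evaluated. Tune offers various functions to define search spaces and sampling methods, so you can specify how these values are sampled.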
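To make the sampling behaviour concrete, here is a minimal sketch in plain Python. It does not call Ray at all: sample_trials is a hypothetical stand-in for what a random search does, and the "lr" parameter and the seed argument are assumptions added for illustration (only "batch_size" comes from the question).

```python
import random


def sample_trials(param_space, num_samples, seed=None):
    """Hypothetical stand-in for random search (not Ray's implementation).

    Draws num_samples configs, choosing each parameter uniformly at random
    from its list of options.
    """
    rng = random.Random(seed)
    return [
        {name: rng.choice(options) for name, options in param_space.items()}
        for _ in range(num_samples)
    ]


# "lr" is an assumed extra parameter; "batch_size" mirrors the question.
space_a = {"lr": [1e-4, 1e-3, 1e-2], "batch_size": [4, 8]}
space_b = {"lr": [1e-4, 1e-3, 1e-2], "batch_size": [4, 8, 16]}

# Same space and same seed: the sampled trials repeat exactly.
run_a = sample_trials(space_a, num_samples=10, seed=0)
run_a_again = sample_trials(space_a, num_samples=10, seed=0)

# A different space is a different sampling problem, so the 10 configs
# (and the best one among them) are not expected to match run_a.
run_b = sample_trials(space_b, num_samples=10, seed=0)

for config in run_b:
    print(config)
```

With only 10 samples, each run evaluates a small random subset of the space, so the "best" configuration reflects which points happened to be drawn. Increasing the number of samples or fixing the random seed are the usual ways to make runs more comparable.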

For more information, please refer to this doc: Key Concepts of Ray Tune — Ray 2.4.0