How severely does this issue affect your experience of using Ray?
- Low: It annoys or frustrates me for a moment.
Dear community,
I have a conceptual question about using Ray for distributed hyperparameter optimization with Bayesian search for an arbitrary estimator, e.g. a Support Vector Machine (SVM) for binary classification.
Following the examples and the explanation in the documentation on the GitHub page, ray.tune.search.bayesopt seems to provide the search algorithm I need. However, I don't really understand how the parallelization works here.
The objective I want to maximize is the accuracy of the SVM classifier. BayesOptSearch proposes hyperparameter configurations (trials) that are used to train the model and are then evaluated with the SVM's scoring function. By default, it picks 10 initial points as a kind of warm-up to fit a Gaussian process (GP) to the functional relationship between the score and the hyperparameters. After that, trials are chosen based on this GP.
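For concreteness, here is a minimal sketch of the setup I have in mind. The toy data, the search-space bounds, and the `objective` function are simplified placeholders for my actual pipeline, and `num_samples=50` is an arbitrary budget:

```python
from ray import tune
from ray.tune.search.bayesopt import BayesOptSearch
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Toy binary-classification data; my real data set is loaded elsewhere.
X, y = make_classification(n_samples=500, random_state=0)

def objective(config):
    # Train/score an SVM with the proposed hyperparameters; returning a
    # dict reports it as the trial's final result to Tune.
    clf = SVC(C=config["C"], gamma=config["gamma"])
    accuracy = cross_val_score(clf, X, y, cv=3).mean()
    return {"accuracy": accuracy}

search_space = {
    "C": tune.uniform(0.1, 100.0),
    "gamma": tune.uniform(1e-4, 1.0),
}

# 10 random warm-up points before the GP takes over (the default).
algo = BayesOptSearch(metric="accuracy", mode="max", random_search_steps=10)

tuner = tune.Tuner(
    objective,
    param_space=search_space,
    tune_config=tune.TuneConfig(search_alg=algo, num_samples=50),
)
results = tuner.fit()
```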
My question now is: how many trials does the GP propose in each iteration before it is updated? If it is only 1, execution would be almost serial after the warm-up, because we get one trial at a time and can only update the GP after receiving that trial's score to pick the next one. If it is greater than 1, e.g. 5, we could train the model on these 5 configurations in parallel.
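Put differently: if I cap concurrency with a ConcurrencyLimiter as below (the cap of 5 is arbitrary, and I am assuming this is the intended way to control it), does the searcher actually keep 5 trials in flight, i.e. propose up to 5 points from the GP before it sees any of their scores?

```python
from ray.tune.search import ConcurrencyLimiter
from ray.tune.search.bayesopt import BayesOptSearch

# Allow up to 5 in-flight suggestions; the searcher would then have to
# propose several points from the GP before any result comes back.
algo = ConcurrencyLimiter(
    BayesOptSearch(metric="accuracy", mode="max"),
    max_concurrent=5,
)
```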
Does anyone know what happens behind the scenes here?
Kind regards,
Jan