Ray Tune PBT/PB2 with Transformers?

I would like to get started with PBT/PB2 after seeing this presentation, and I want to use it with Hugging Face Transformers’ Trainer. Some collaboration has happened between Ray and HF, and it is neat that hyperparameter_search is implemented on the Trainer. There’s even a handy blog post!

The blog post gives an example, but looking through the source code of the implementation in Transformers, it is not clear to me which types of schedulers are supported. Can you give an example of how to use PopulationBasedTraining (and/or the bandit-based PB2) with the Transformers Trainer?

Hey @BramVanroy! All schedulers should be supported when you call Trainer.hyperparameter_search!

For an example using PBT with HF transformers, you can take a look at this example.
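In case it helps to see the mechanics, the exploit/explore loop at the heart of PBT can be sketched in plain Python. This is a toy illustration, not Ray’s actual implementation; every name in it is made up:

```python
import random

def pbt_step(population, bounds, quantile=0.25):
    """One toy PBT iteration: the bottom performers copy a top performer's
    hyperparameters (exploit), then perturb them within bounds (explore)."""
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    cutoff = max(1, int(len(ranked) * quantile))
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    for member in bottom:
        donor = random.choice(top)
        member["hparams"] = dict(donor["hparams"])      # exploit: clone a winner
        for name, (lo, hi) in bounds.items():           # explore: perturb and clip
            perturbed = member["hparams"][name] * random.choice([0.8, 1.2])
            member["hparams"][name] = min(max(perturbed, lo), hi)
    return population

bounds = {"learning_rate": (1e-5, 1e-3)}
population = [
    {"score": s, "hparams": {"learning_rate": 3e-4}}
    for s in (0.9, 0.5, 0.2, 0.1)
]
pbt_step(population, bounds)
```

In the real scheduler the "score" comes from the metric you report during training, and the perturbation happens at checkpoint boundaries rather than in-place.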

Thank you for your reply, @amogkam!

I have two more questions, specifically with PB2 in mind. Looking at the example that you posted, I see one search space (tune_config) and then another search space inside the scheduler itself. How do those two work together? And how is hp_space used alongside the optimization that goes on in the scheduler?
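To make the question concrete, my current mental model is that hp_space only seeds each trial’s initial configuration, while the scheduler’s bounds constrain the later perturbations. A plain-Python sketch of that assumption (this is not the Ray API, just how I picture it; please correct me if it’s wrong):

```python
import random

# Assumed mental model, not the Ray API: the scheduler's bounds limit its
# mutations, while hp_space only draws the starting point for each trial.
hyperparam_bounds = {"learning_rate": [1e-5, 1e-3], "weight_decay": [0.0, 0.3]}

def hp_space(_trial):
    # Initial configuration for a trial, drawn from the same ranges.
    return {name: random.uniform(lo, hi)
            for name, (lo, hi) in hyperparam_bounds.items()}

config = hp_space(None)
```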

And second, after hyperparameter_search, how do we load the best/final model into the trainer’s model so that we can use it for inference or testing? Previously, with plain grid search, I did it like so:

```python
from json import dump

best_params = trainer.hyperparameter_search(...)

# Set the trainer to the best hyperparameters found
for hparam, v in best_params.hyperparameters.items():
    setattr(trainer.args, hparam, v)

# Save the optimal hyperparameters (the hyperparameters dict is
# JSON-serializable; the BestRun object itself is not)
with output_dir.joinpath("opt_hparams.json").open("w", encoding="utf-8") as hp_out:
    dump(best_params.hyperparameters, hp_out, indent=4, sort_keys=True)

# Now train the model from scratch with the best hyperparameters
train_result = trainer.train()
# ... and then get predictions from the model
predictions = trainer.predict(test_dataset)  # predict() needs the test dataset
```

But because PBT/PB2 work differently, I am not sure how to continue from here. Any help is appreciated!
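For reference, this is how I would persist what hyperparameter_search hands back. I am mimicking the returned object with a plain namedtuple here, since run_id, objective, and hyperparameters are the fields I see on it; treat this as a sketch, not the real transformers class:

```python
import json
from collections import namedtuple

# Stand-in for the object that hyperparameter_search returns; the real
# class lives in transformers, this namedtuple only mirrors its fields.
BestRun = namedtuple("BestRun", ["run_id", "objective", "hyperparameters"])
best = BestRun(run_id="0", objective=0.91,
               hyperparameters={"learning_rate": 3e-5, "num_train_epochs": 4})

# Serialize only the plain fields; the wrapper object is not JSON-serializable.
serialized = json.dumps(best._asdict(), indent=4, sort_keys=True)
```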