How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I use Tune for two things:
- Tuning models (Validation-set and K-fold approach)
- Running N models to measure mean and std performance.
In the second case, I want to see the model predictions from all N models, not just the test metric.
How do I fetch those predictions?
The Trainable framework supports step() and save_checkpoint(), but neither really fits the use case. I calculate metrics after each step, and also on the final step.
At the moment, I’ve resorted to weird workarounds like compressing the predictions DataFrame into a string and reporting it as a “metric”, but this honestly seems like a broken approach.
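Concretely, the workaround looks roughly like this (a sketch only; predictions and val_loss are placeholder names, and this runs inside step()):

    # Smuggle the predictions out of the trial as a fake "metric"
    predictions_as_string = predictions.to_json()
    return {"val_loss": val_loss, "predictions_json": predictions_as_string}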
You’re correct: a “metric” should be a small, lightweight summary value of your training, such as the training/validation loss.
You can save larger outputs like predictions as generic trial artifacts by writing to the current working directory inside a Tune Trainable.
For example:
import os

from ray import air, tune

class MyTrainable(tune.Trainable):
    def step(self):
        # Do training, then generate predictions
        # predictions = ...
        iteration = self.training_iteration
        with open(f"./predictions_for_iter={iteration}.pt", "wb") as f:
            # Dump your pandas DF or save in whatever way you want!
            # torch.save(predictions, f)
            pass
        # step() must still return a dict of small summary metrics
        return {"my_metric": 1.0}

tuner = tune.Tuner(
    MyTrainable,
    # Stop after a few iterations so the example terminates
    run_config=air.RunConfig(stop={"training_iteration": 3}),
)
results = tuner.fit()

for result in results:
    # All your artifacts are saved relative to the trial directory
    print("Trial directory:", result.log_dir)  # On ray<=2.4
    print(os.listdir(result.log_dir))  # All of your artifacts should be here
    # print("Trial directory:", result.path)  # On ray>2.4 (on nightly as of 6/5)
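To tie this back to the N-models use case, here is a rough sketch of reading those artifacts back out of each trial directory and aggregating them. It assumes each predictions file is a pandas DataFrame saved with torch.save (as in the commented-out line above) and that all DataFrames share the same index:

    import os

    import pandas as pd
    import torch

    all_predictions = []
    for result in results:
        trial_dir = result.log_dir  # result.path on ray>2.4
        for fname in os.listdir(trial_dir):
            if fname.startswith("predictions_for_iter="):
                # Load one trial's predictions DataFrame back from its artifact file
                with open(os.path.join(trial_dir, fname), "rb") as f:
                    all_predictions.append(torch.load(f))

    # Per-sample mean/std of predictions across all N trials
    stacked = pd.concat(all_predictions)
    print(stacked.groupby(level=0).agg(["mean", "std"]))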
This user guide section may also be helpful. It goes over how to get data out of Tune in the form of an AIR checkpoint: Getting Data in and out of Tune — Ray 2.4.0
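For completeness, here is a minimal sketch of that checkpoint-based route, assuming the class Trainable API and Ray 2.4-era AIR config objects (the file name predictions.pkl and the metric name are placeholders):

    import os
    import pickle

    from ray import air, tune

    class CheckpointingTrainable(tune.Trainable):
        def step(self):
            # self.predictions = ...  # produce predictions during training
            return {"my_metric": 1.0}

        def save_checkpoint(self, checkpoint_dir):
            # Put the predictions inside the checkpoint directory
            with open(os.path.join(checkpoint_dir, "predictions.pkl"), "wb") as f:
                pickle.dump(getattr(self, "predictions", None), f)
            return checkpoint_dir

    tuner = tune.Tuner(
        CheckpointingTrainable,
        run_config=air.RunConfig(
            stop={"training_iteration": 1},
            checkpoint_config=air.CheckpointConfig(checkpoint_at_end=True),
        ),
    )
    results = tuner.fit()

    for result in results:
        # result.checkpoint is an AIR Checkpoint pointing at the saved directory
        with result.checkpoint.as_directory() as checkpoint_dir:
            with open(os.path.join(checkpoint_dir, "predictions.pkl"), "rb") as f:
                predictions = pickle.load(f)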