Ray Tune: Get mean of all progress.csv files of certain arrangement

cool-RR · January 7, 2023, 8:12pm

I’m using ray.tune.run to compare the performance of several different arrangements of options. This works, and I can see the final metrics in tune_analysis.csv. But each trial also has a progress.csv file with interesting information.

Here’s what I want: After ray.tune.run, I want to take all the progress.csv files from each of the trials. Say that for each arrangement of options, there are 10 samples, i.e. 10 trials. I want to take the 10 progress.csv files from the 10 trials, and make an average CSV file, containing the average value for each cell in the 10 files.

While I could write code that finds these progress.csv files by browsing the directory tree, I wonder if Ray Tune provides a better way?
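For reference, the directory-browsing fallback could look roughly like the sketch below. It uses only the standard library. The function name mean_of_csvs is hypothetical, rows are matched by position (row 1 with row 1, and so on), and only columns whose cells parse as numbers in every file are kept.

```python
import csv


def _as_float(text):
    """Return text as a float, or None if it is not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def mean_of_csvs(paths):
    """Average several CSV files cell by cell.

    Returns a list of dicts, one per row. Only as many rows as the
    shortest file are produced, and a column is dropped from a row
    when any file has a non-numeric value there.
    """
    tables = []
    for path in paths:
        with open(path, newline="") as f:
            tables.append(list(csv.DictReader(f)))

    n_rows = min((len(t) for t in tables), default=0)
    averaged = []
    for i in range(n_rows):
        row = {}
        for column in tables[0][i]:
            values = [_as_float(t[i].get(column)) for t in tables]
            if all(v is not None for v in values):
                row[column] = sum(values) / len(values)
        averaged.append(row)
    return averaged
```

The paths themselves would still have to be collected and grouped by arrangement by hand, for example by globbing for progress.csv under the experiment directory, which is the part I would like to avoid.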

Thanks,
Ram.

justinvyu · January 10, 2023, 8:25pm

Hi @cool-RR,

This is possible with the ResultGrid API. The guide "Analyzing Tune Experiment Results" (Ray 3.0.0.dev0 documentation) shows how to access the history of reported metrics (the same data that is stored in each trial’s progress.csv file) as a pandas DataFrame, which should let you take the mean or perform whatever other operations you’re looking for.

It looks like you’re using the old tune.run API, which outputs an ExperimentAnalysis instead. You can wrap it with ResultGrid to follow along with the guide.

from ray import tune
from ray.tune.result_grid import ResultGrid
analysis = tune.run(...)
result_grid = ResultGrid(analysis)

If you want to look at results from an experiment that has already finished, you can access the results with the following API:

from ray.tune import Tuner
experiment_directory = "path/to/your/experiment"
restored_tuner = Tuner.restore(experiment_directory)
result_grid = restored_tuner.get_results()
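Once the per-trial histories are available as pandas DataFrames, the averaging itself could look something like the sketch below. This is only an illustration, not part of the Tune API: mean_history_per_config is a hypothetical helper, trials are grouped by their config serialized as JSON (which assumes repeated samples of one arrangement share an identical config), and rows are matched by position rather than by a metric such as training_iteration.

```python
import json

import pandas as pd


def mean_history_per_config(trials):
    """Average metric histories across trials that share a config.

    `trials` is an iterable of (config_dict, history_dataframe) pairs.
    Returns a dict mapping the JSON-serialized config to a DataFrame
    holding the row-by-row mean of the numeric columns.
    """
    groups = {}
    for config, history in trials:
        key = json.dumps(config, sort_keys=True, default=str)
        groups.setdefault(key, []).append(history)

    averaged = {}
    for key, histories in groups.items():
        # Keep numeric columns only and renumber rows from 0 so that
        # the same report position lines up across trials.
        numeric = [
            h.select_dtypes("number").reset_index(drop=True) for h in histories
        ]
        # Rows with the same index label are averaged together; if the
        # trials differ in length, later rows use fewer trials.
        averaged[key] = pd.concat(numeric).groupby(level=0).mean()
    return averaged
```

With a ResultGrid, the input pairs could be built with something like [(result_grid[i].config, result_grid[i].metrics_dataframe) for i in range(len(result_grid))], assuming each result exposes config and metrics_dataframe as described in the guide. Each averaged DataFrame can then be written out with to_csv.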

cool-RR · January 13, 2023, 8:10pm

Thank you Justin. I see that you linked to pages from the Ray 3 documentation, and at least one of them doesn’t have a parallel in the Ray 2.1.0 documentation. Is this functionality fully available in Ray 2.1.0? I’m not quite ready to upgrade yet.

justinvyu · January 18, 2023, 6:28pm

Hi @cool-RR,

Yes, the guide was added more recently, but all of the ResultGrid functionality is there in 2.1.0.

Also, the Ray 3.0.0dev is just the nightly branch name - we’re still on version 2.x! Upgrading to the latest 2.2 shouldn’t require many changes if any.

cool-RR · January 18, 2023, 6:41pm

I understand. Thank you for your help.

cool-RR · January 26, 2023, 11:52am

Another question about this: My Tune run takes several days. How can I get partial results for my ResultGrid while the program is still running?

kai · January 30, 2023, 7:29pm

You can just use the code above on the experiment directory:

from ray.tune import Tuner
experiment_directory = "path/to/your/experiment"
restored_tuner = Tuner.restore(experiment_directory)
result_grid = restored_tuner.get_results()

cool-RR · January 31, 2023, 4:41pm

Awesome, thank you Kai!
