Ray Tune: Get the mean of all progress.csv files for a given arrangement

I’m using ray.tune.run to compare the performance of several different arrangements of options. This works for me, and I’m able to see the final metrics in tune_analysis.csv, which is good. But each trial also has a progress.csv file with interesting information.

Here’s what I want: After ray.tune.run, I want to take all the progress.csv files from each of the trials. Say that for each arrangement of options, there are 10 samples, i.e. 10 trials. I want to take the 10 progress.csv files from the 10 trials, and make an average CSV file, containing the average value for each cell in the 10 files.

While I could write code that finds these progress.csv files by browsing the directory tree, I wonder if Ray Tune provides a better way?

Thanks,
Ram.

Hi @cool-RR,

This is possible with the ResultGrid API. The guide "Analyzing Tune Experiment Results" (Ray 3.0.0.dev0 docs) shows you how to access the history of reported metrics (which are stored inside the progress.csv file) for each trial as a pandas dataframe, which should allow you to take the mean or perform whatever other operations you’re looking for.

It looks like you’re using the old tune.run API, which outputs an ExperimentAnalysis instead. You can wrap it with ResultGrid to follow along with the guide.

from ray import tune
from ray.tune.result_grid import ResultGrid
analysis = tune.run(...)
result_grid = ResultGrid(analysis)  # wrap the ExperimentAnalysis

If you want to look at results from an experiment that has already finished, you can access the results with the following API:

from ray.tune import Tuner

experiment_directory = "path/to/your/experiment"
restored_tuner = Tuner.restore(experiment_directory)
result_grid = restored_tuner.get_results()
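For the averaging itself, here is a minimal sketch in plain pandas. It assumes you have already collected one (config, dataframe) pair per trial, for example from result.config and result.metrics_dataframe on each entry of the ResultGrid. The helper names mean_progress and mean_progress_by_config are made up for illustration and are not part of Ray. Rows are aligned by position (row 0 of every trial, row 1 of every trial, and so on), and only numeric columns are averaged.

```python
import json
from collections import defaultdict

import pandas as pd


def mean_progress(frames):
    """Cell-wise mean of the numeric columns across per-trial DataFrames.

    Rows are aligned by position. If trials have different lengths, the
    later rows are averaged over the trials that actually reached them.
    """
    numeric = [
        frame.select_dtypes(include="number").reset_index(drop=True)
        for frame in frames
    ]
    # Stack the frames under an outer key per trial, then average over the
    # inner level, which is the row position within each trial.
    stacked = pd.concat(numeric, keys=range(len(numeric)))
    return stacked.groupby(level=1).mean()


def mean_progress_by_config(configs_and_frames):
    """Group (config, DataFrame) pairs by config and average each group."""
    groups = defaultdict(list)
    for config, frame in configs_and_frames:
        # Configs are dicts, so serialize them to get a hashable group key.
        key = json.dumps(config, sort_keys=True, default=str)
        groups[key].append(frame)
    return {key: mean_progress(frames) for key, frames in groups.items()}


# Assumed usage with a ResultGrid (not executed here):
#   pairs = [(r.config, r.metrics_dataframe) for r in result_grid]
#   for key, mean_df in mean_progress_by_config(pairs).items():
#       mean_df.to_csv(...)  # pick a file name per arrangement
```

Grouping on the serialized config treats two trials as the same arrangement only if every config value matches, so drop any per-trial values (such as a seed) from the config before grouping if you have them.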

Thank you Justin. I see that you linked to pages from the Ray 3 documentation, and at least one of them doesn’t have a parallel in the Ray 2.1.0 documentation. Is this functionality fully available in Ray 2.1.0? I’m not quite ready to upgrade yet.

Hi @cool-RR,

Yes, the guide was added more recently, but all of the ResultGrid functionality is there in 2.1.0.

Also, Ray 3.0.0.dev0 is just the name of the nightly branch; we’re still on version 2.x! Upgrading to the latest 2.2 should require few changes, if any.

I understand. Thank you for your help.

Another question about this: My Tune run takes several days. How can I get partial results for my ResultGrid while the program is still running?

You can just use the code above on the experiment directory:

experiment_directory = "path/to/your/experiment"
restored_tuner = Tuner.restore(experiment_directory)
result_grid = restored_tuner.get_results()
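If you would rather not touch the Tuner while the run is in progress, the directory-browsing approach mentioned in the question also works for partial results. This is only a rough sketch: it assumes the usual layout where each trial directory sits directly under the experiment directory and contains its own progress.csv, and load_progress_frames is a made-up helper name, not a Ray API.

```python
from pathlib import Path

import pandas as pd


def load_progress_frames(experiment_directory):
    """Return {trial directory name: DataFrame} for each progress.csv found."""
    frames = {}
    pattern = "*/progress.csv"  # assumed layout: <experiment>/<trial>/progress.csv
    for csv_path in sorted(Path(experiment_directory).glob(pattern)):
        try:
            frames[csv_path.parent.name] = pd.read_csv(csv_path)
        except pd.errors.EmptyDataError:
            # The file exists but the trial has not reported a result yet.
            continue
    return frames
```

Since the files are still being written while the experiment runs, treat the last row of each dataframe as possibly incomplete.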