Calculating single metric value for dataset

Falcon_Programmer · May 23, 2022, 11:10pm

I’ve just managed to get Ray Train working and I now want to track overall progress in the report.

I’m using torchmetrics to calculate an overall MSE for a dataset. How can I produce a single value per training epoch? I have to train in batches, so before using Ray I saved all outputs and labels into lists and concatenated them before calculating the metric, e.g.

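import torch
import torchmetrics

all_outputs = []
all_labels = []

for x, labels in train_loader:
	outputs = model(x, True)  # call the module rather than .forward() so hooks run
	outputs = outputs.squeeze(-1)  # network has a single output; drop only that dim so a batch of size 1 stays 1-D

	# detach so the stored outputs do not keep the autograd graph alive
	all_outputs.append(outputs.detach())
	all_labels.append(labels)

# concatenate the per-batch tensors, then compute one MSE over the whole epoch
all_outputs = torch.cat(all_outputs)
all_labels = torch.cat(all_labels)

mse = torchmetrics.functional.mean_squared_error(all_outputs, all_labels).item()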

I have 2 workers running, so currently when I do this I end up with two MSE values in the report. Is there any way to get a single value for the entire dataset rather than two values?

kai · May 31, 2022, 8:35am

How are you reporting/saving your MSE metrics?

Ray Train natively supports torchmetrics, so if you use train.report(), you should be able to get both values and be able to access them in a callback (see e.g. here: Ray Train User Guide — Ray 1.12.1). You could then aggregate them in the callback for further processing, if desired.
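As a rough illustration of that aggregation step, here is a minimal sketch in plain Python. It assumes each worker reports its summed squared error and its sample count instead of a finished MSE, and that the callback ends up with one result dict per worker. The key names "sse" and "n" and the helper combine_worker_mse are hypothetical, not part of Ray or torchmetrics.

```python
from typing import Dict, List


def combine_worker_mse(results: List[Dict[str, float]]) -> float:
    """Combine per-worker partial sums into one dataset-wide MSE.

    Each dict is assumed to hold that worker's summed squared error
    ("sse") and sample count ("n"); both key names are hypothetical.
    """
    total_sse = sum(r["sse"] for r in results)
    total_n = sum(r["n"] for r in results)
    if total_n == 0:
        raise ValueError("no samples reported by any worker")
    return total_sse / total_n


# Two workers with unequal shard sizes (made-up numbers).
worker_results = [
    {"sse": 12.0, "n": 6},  # worker 0: local MSE = 2.0
    {"sse": 4.0, "n": 4},   # worker 1: local MSE = 1.0
]
overall_mse = combine_worker_mse(worker_results)
print(overall_mse)  # 16.0 / 10 = 1.6
```

Summing the squared errors and counts before dividing matters when the workers see different numbers of samples: simply averaging the two local MSE values here would give 1.5, not the dataset-wide 1.6.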
