Is there an example available for using TorchTrainer + Preprocessor in parallel? In my setup, Ray preprocesses everything first and then trains. Is there an example that shows how to preprocess and train on batches on the fly?
@localh Ray Data now does streaming execution by default. (Try upgrading to 2.7 if you’re on an older version of Ray.)
This means that batch fetching from S3, preprocessing, and feeding into training all happen in a streaming fashion.
The only way to get the “preprocess everything first” behavior is to call ds.materialize() at some point before your training starts. See here for more info: Ray Data Internals — Ray 2.7.1
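Here’s a minimal sketch of what this looks like, assuming Ray >= 2.7. The S3 path, the preprocessing function, and the model are all placeholders; swap in your own:

```python
import numpy as np
import torch
import ray
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer

# Read lazily; nothing executes yet. (Hypothetical bucket path.)
ds = ray.data.read_parquet("s3://my-bucket/train/")

def preprocess(batch: dict) -> dict:
    # Placeholder per-batch transform; runs overlapped with training.
    batch["x"] = batch["x"].astype(np.float32)
    return batch

# Still lazy -- this runs in a streaming fashion during training.
ds = ds.map_batches(preprocess)

def train_loop_per_worker():
    model = torch.nn.Linear(8, 1)  # placeholder model
    shard = ray.train.get_dataset_shard("train")
    for epoch in range(2):
        # Batches are fetched, preprocessed, and fed on the fly.
        for batch in shard.iter_torch_batches(batch_size=1024):
            ...  # forward/backward pass on batch["x"], batch["y"]

trainer = TorchTrainer(
    train_loop_per_worker,
    datasets={"train": ds},
    scaling_config=ScalingConfig(num_workers=2),
)
trainer.fit()

# By contrast, adding this line before trainer.fit() would force the whole
# pipeline to execute up front, which is the behavior you're seeing:
# ds = ds.materialize()
```

As long as you don’t call ds.materialize(), the map_batches preprocessing stays lazy and overlaps with the training loop instead of running to completion first.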