What is the recommended and efficient way to make the best use of all Ray features?

I am a beginner with the Ray library, and I would like to make the most of it in a project I have been working on. Put simply, I train time series data with PyTorch and tune hyperparameters. There seem to be various ways to apply Ray to this project, which is confusing. For example, even for hyperparameter tuning alone, there are the following approaches:

  1. Using train_loop_per_worker, ray.train.torch.TorchTrainer, and ray.tune.Tuner.
  2. Putting the function directly into ray.tune.Tuner (Trainable Function API; a simplified sketch of this approach follows the list).
  3. Creating a class that inherits from ray.tune.Trainable and putting it into ray.tune.Tuner (Trainable Class API).
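
For concreteness, here is roughly what approach 2 looks like in my project, heavily simplified. This assumes Ray 2.x, where metrics are reported with ray.train.report (older releases used session.report or tune.report); the loss computation stands in for my real PyTorch training step:

```python
from ray import train, tune

# Approach 2, heavily simplified: a plain Python function as the trainable.
# The "loss" below is a stand-in for a real PyTorch training step.
def objective(config):
    for epoch in range(10):
        loss = config["lr"] / (epoch + 1)
        train.report({"loss": loss})  # report metrics back to Tune

tuner = tune.Tuner(
    objective,
    param_space={"lr": tune.loguniform(1e-4, 1e-1)},
    tune_config=tune.TuneConfig(metric="loss", mode="min", num_samples=4),
)
results = tuner.fit()
```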

I want to use the approach that is most recommended and effective in Ray. I initially chose the first approach, switched to the second for various reasons, and then switched to the third. With the third approach, however, it seems that I cannot use concepts like TorchPredictor, so I am considering switching back to the first.

To summarize, I have two main questions for now:

  1. What is the recommended and efficient way to make the best use of all Ray features?
  2. Is it not possible to use Ray’s features with the ray.tune.Trainable Class API?

Additionally, I would appreciate any helpful articles or insights you can provide. Thank you for reading.

@seokjin1013 Hello, and welcome to the Ray community. Those are good but very broad questions, because Ray is a general-purpose framework for distributed compute. For that it provides primitive abstractions such as Tasks and Actors, with which you implement distributed functions (Tasks) and stateful distributed services (Actors).
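
To make those primitives concrete, here is a minimal sketch; the square function and Counter class are arbitrary examples, not anything specific to your workload:

```python
import ray

ray.init()

# A Task: a stateless function that runs remotely on the cluster.
@ray.remote
def square(x):
    return x * x

# An Actor: a stateful class whose methods run in a dedicated worker process.
@ray.remote
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

print(ray.get([square.remote(i) for i in range(4)]))  # [0, 1, 4, 9]

counter = Counter.remote()
print(ray.get(counter.increment.remote()))  # 1
```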

Tuner always takes an objective (a training function, or a Trainable class) that iterates over your data with different hyperparameters.
train_loop_per_worker is the function you provide to an instance of TorchTrainer; that is your so-called objective function. Now, if you also want to tune your PyTorch model, you create a Trainer and supply that to the Tuner as your objective.

So the idea is:

  1. Use Tune alone and supply your objective function
  2. Use Trainer and Tune together. Define a Trainer, supply the training function (by convention your train_loop_per_worker), and then use that Trainer as your objective for the Tuner, as sketched below.
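
A minimal sketch of idiom 2, assuming Ray 2.x; the loop body is a placeholder for a real PyTorch training loop:

```python
from ray import train, tune
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer

# The per-worker training loop; the body here is a placeholder.
def train_loop_per_worker(config):
    for epoch in range(config["epochs"]):
        train.report({"loss": 1.0 / (epoch + 1)})

trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"epochs": 5},
    scaling_config=ScalingConfig(num_workers=2),
)

# The Trainer itself is the objective handed to the Tuner; Tune then
# overrides fields of train_loop_config per trial via param_space.
tuner = tune.Tuner(
    trainer,
    param_space={"train_loop_config": {"epochs": tune.choice([5, 10])}},
)
results = tuner.fit()
```
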
  1. What is the recommended and efficient way to make the best use of all Ray features?

That is a very broad question; it depends on your use cases and workloads, and each workload maps to its own Ray library. Batch inference, say, would have you use Ray Data; serving models would require Ray Serve; and so on.

  1. Is it not possible to use Ray’s features with the ray.tune.Trainable Class API?

Not sure exactly what you mean here, but Trainable is an abstract class that you can subclass for finer control over how training is conducted. If you want Ray to use its default behaviour, supply your training function (objective function) instead.
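
For reference, here is a minimal sketch of the Class API, assuming Ray 2.x. The arithmetic in step is a placeholder for real training, and returning done=True is one way to end the trial:

```python
from ray import tune

class MyTrainable(tune.Trainable):
    def setup(self, config):
        # Called once per trial with the sampled hyperparameters.
        self.lr = config["lr"]
        self.score = 0.0
        self.iters = 0

    def step(self):
        # One training iteration; Tune calls this repeatedly and
        # collects the returned metrics.
        self.score += self.lr
        self.iters += 1
        return {"score": self.score, "done": self.iters >= 5}

tuner = tune.Tuner(
    MyTrainable,
    param_space={"lr": tune.grid_search([0.01, 0.1])},
    tune_config=tune.TuneConfig(metric="score", mode="max"),
)
results = tuner.fit()
```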

We provide wrappers (Trainers) for many integrated ML libraries. For these you must provide a training function, by convention called train_loop_per_worker.

I hope that helps.

cc: @justinvyu @bveeramani Want to add any more clarification here?

Use Tune alone and supply an objective function

I train time series data with PyTorch and tune hyperparameters.

@seokjin1013 For this use case, I’d recommend using TorchTrainer with Tuner. I think it’d be easier than using Ray Tune alone, especially if you’re performing distributed training.
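
To illustrate why: inside train_loop_per_worker, the ray.train.torch utilities take care of the distributed details (DistributedDataParallel wrapping, device placement, distributed sampling), so the loop itself stays plain PyTorch. A rough sketch, assuming Ray 2.x; the toy model and random data are placeholders:

```python
import torch
from ray import train
from ray.train.torch import prepare_model, prepare_data_loader

def train_loop_per_worker(config):
    # prepare_model moves the model to the right device and wraps it in DDP.
    model = prepare_model(torch.nn.Linear(8, 1))

    dataset = torch.utils.data.TensorDataset(torch.randn(64, 8), torch.randn(64, 1))
    # prepare_data_loader adds a DistributedSampler and moves batches to the device.
    loader = prepare_data_loader(torch.utils.data.DataLoader(dataset, batch_size=16))

    optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
    loss_fn = torch.nn.MSELoss()
    for epoch in range(config["epochs"]):
        for X, y in loader:
            loss = loss_fn(model(X), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        train.report({"loss": loss.item()})  # per-epoch metrics back to Train/Tune
```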