Hello everyone
I am new to Ray.
I have followed the steps to configure a Ray cluster on AWS.
My challenge now: I have an API built with FastAPI. One endpoint of the API accepts requests from users and runs inference with a machine learning model.
I want to use Ray to distribute this ML model's workload across multiple nodes, which should speed up inference for large input data.
I want to be able to deploy the FastAPI app (along with the ML model service) somewhere, and when a request comes in for inference (rough flow below, with a sketch after the list):
- In the FastAPI app, I initialize Ray with ray.init() and connect to the remote Ray cluster.
- Call the function that runs the inference using Ray's .remote().
- The remote Ray cluster runs the workload in a distributed fashion across the head node and worker nodes created on AWS.
- FastAPI waits on the futures using Ray's .get(); once they are ready, Ray returns the results and FastAPI responds to the user.
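To make it concrete, here is a minimal sketch of what I have in mind. The cluster address, the /predict endpoint, the payload format, and the run_inference function are placeholders I made up for illustration, not my real code:

```python
import ray
from fastapi import FastAPI

app = FastAPI()

# Connect to the remote Ray cluster once at startup.
# The address is a placeholder for my head node's Ray Client endpoint.
ray.init(address="ray://<head-node-ip>:10001")

@ray.remote
def run_inference(chunk):
    # Placeholder for the real model call; in practice the model would
    # be loaded on the worker (e.g. from S3) and applied to the chunk.
    return [len(str(item)) for item in chunk]

@app.post("/predict")
def predict(payload: dict):
    # Split the incoming data into chunks and fan them out to the cluster.
    chunks = payload["chunks"]
    futures = [run_inference.remote(chunk) for chunk in chunks]
    # Block until all remote tasks finish, then return the combined result.
    # (A plain def endpoint runs in FastAPI's threadpool, so the blocking
    # ray.get() should not stall the event loop.)
    results = ray.get(futures)
    return {"predictions": results}
```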
Is this possible with Ray and FastAPI?