Calling Ray Serve from Matlab

jharaldson · February 18, 2022, 9:49am

We have simulations running in Matlab and would like to serve predictions using Ray Serve. The simulations are triggered from Python code using the Matlab Engine API for Python, which allows them to be executed in parallel using Ray. Function calls from Python to Matlab through the Matlab Engine can return data structures, but in this case we would like to query Ray Serve from within a simulation and then continue with the next simulation step. With a Python simulator we could pass the Serve handle to the simulation function and serving would work seamlessly. For Matlab simulators (“external environments”), the main way to serve predictions would be through HTTP requests. However, for short (sub-second) simulations, HTTP requests add too much overhead to the simulation time.

Are there any alternatives to HTTP requests? Apache Arrow seems like a possible option, but Matlab support for it is currently limited. Perhaps we could do a “zero copy” operation between the Matlab process and a Serve “Router Actor”? The Matlab Engine API also limits which functionality is supported. Any ideas are appreciated.

shrekris · February 18, 2022, 8:08pm

To clarify, is this your setup:

1. Python Serve deployments that call Matlab simulation code
2. The Matlab simulation code then makes calls to other Serve deployments

Is the problem that there’s no clear way to call Python code from Matlab in step 2?

jharaldson · February 18, 2022, 8:57pm

1. Initiate a Python Serve deployment.
2. Python actors execute Matlab simulation code.
3. Matlab simulation code then calls the Python Serve deployment.

Yes, you could say the problem is that there is no clear way to call Python code from Matlab, in the way you can pass the Serve deployment handle to a function in Python and leverage Ray's communication to pass values between machines in the cluster. Triggering a million HTTP requests on a Kubernetes cluster does not seem sustainable.

simon-mo · February 18, 2022, 11:46pm

“there is no clear way to call Python code from Matlab”

I’m not too familiar with Matlab, but if this cannot be done then you can’t use a Serve handle here.

Triggering millions of requests should be fine as long as the number of queries per second is kept under control. A single Serve HTTP server can handle 1.5k queries per second, and it can be scaled horizontally.
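
As a rough back-of-envelope check, the sketch below estimates how long a batch of requests would take at a given aggregate throughput. The 1.5k queries per second figure comes from this post; the request count, the number of HTTP servers, and the assumption that throughput scales perfectly linearly are illustrative only, and `drain_time_seconds` is a hypothetical helper name.

```python
def drain_time_seconds(total_requests: int, qps_per_server: float, num_servers: int = 1) -> float:
    """Idealized time to serve `total_requests`, assuming linear scaling across servers."""
    if qps_per_server <= 0 or num_servers <= 0:
        raise ValueError("qps_per_server and num_servers must be positive")
    return total_requests / (qps_per_server * num_servers)


if __name__ == "__main__":
    # One million requests at 1.5k QPS per HTTP server (server counts are assumed values).
    for servers in (1, 4):
        seconds = drain_time_seconds(1_000_000, 1500, servers)
        print(f"{servers} server(s): {seconds:.0f} s")
```

With a single server this works out to roughly 667 seconds for one million requests. It says nothing about per-request latency, which is the concern for sub-second simulations.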

jharaldson · February 19, 2022, 12:54am

I would need to do proper profiling, but when running similar simulations written in Python and passing the Serve handle for the prediction queries, the total simulation time is 10-20% of what it is when the same queries are made with HTTP requests. It seems HTTP adds overhead that contributes substantially to the total simulation time.
Are any alternatives to HTTP requests supported? A custom solution is possible, but it would be nice to leverage existing Ray components.
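
One possible starting point for that profiling is to time the per-query cost of each transport in isolation. The sketch below is a minimal harness, not a Ray API: `mean_latency` and `slowdown` are hypothetical helper names, and the lambda is a stand-in for a real query (a Serve handle call or an HTTP request). The 10-20% figure quoted above implies the HTTP path is about 5 to 10 times slower overall.

```python
import time
from typing import Callable


def mean_latency(query: Callable[[], object], n: int = 1000) -> float:
    """Average wall-clock seconds per call of `query` over `n` calls."""
    if n <= 0:
        raise ValueError("n must be positive")
    start = time.perf_counter()
    for _ in range(n):
        query()
    return (time.perf_counter() - start) / n


def slowdown(fast_fraction: float) -> float:
    """If the fast path takes `fast_fraction` of the slow path's time, return the slow/fast ratio."""
    if not 0 < fast_fraction <= 1:
        raise ValueError("fast_fraction must be in (0, 1]")
    return 1.0 / fast_fraction


if __name__ == "__main__":
    # Stand-in workload; replace with the actual prediction query being measured.
    per_call = mean_latency(lambda: sum(range(100)))
    print(f"mean latency: {per_call * 1e6:.2f} microseconds per call")
    print(f"implied slowdown range: {slowdown(0.20):.0f}x to {slowdown(0.10):.0f}x")
```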

simon-mo · February 19, 2022, 12:56am

We do support hosting any web server (HTTP, TCP, gRPC, Arrow Flight) within the Serve replica to handle these! After all, you can run arbitrary Python code, including hosting a web server.
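
To make the TCP option concrete, here is a minimal sketch using only the Python standard library, with no Ray involved. It keeps one connection open and exchanges newline-delimited JSON, so the connection setup cost is paid once rather than per query. Everything here is an assumption for illustration: `predict` is a placeholder model, the wire format is invented, and starting such a server from inside a Serve replica (for example in the deployment's constructor) is left out. The Matlab side would need a matching TCP client, which is not shown.

```python
import json
import socket
import socketserver
import threading


def predict(features):
    """Placeholder model (hypothetical): returns the sum of the inputs."""
    return sum(features)


class PredictionHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One connection serves many queries: read a JSON line, reply with a JSON line.
        for line in self.rfile:
            features = json.loads(line)
            reply = json.dumps({"prediction": predict(features)}) + "\n"
            self.wfile.write(reply.encode("utf-8"))


class ThreadedServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True


def start_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port 0 lets the OS pick a free port."""
    server = ThreadedServer((host, port), PredictionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_many(address, batches):
    """Send several queries over a single persistent connection."""
    results = []
    with socket.create_connection(address) as sock, sock.makefile("rwb") as stream:
        for features in batches:
            stream.write((json.dumps(features) + "\n").encode("utf-8"))
            stream.flush()
            results.append(json.loads(stream.readline())["prediction"])
    return results


if __name__ == "__main__":
    server = start_server()
    try:
        print(query_many(server.server_address, [[1, 2, 3], [0.5, 0.5]]))
    finally:
        server.shutdown()
        server.server_close()
```

Whether this actually beats HTTP for a given simulation would still need to be measured; the sketch only shows the shape of a persistent-connection protocol.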

jharaldson · February 19, 2022, 12:59am

Thanks for the quick answer. I will look at hosting alternative web servers to see how that affects total simulation time.
