Specifying resources using Ray Serve

vadimkantorov · May 19, 2025, 9:41am

Hi! I’m a Ray newbie.

I’m looking for recommendations on how to build a very simple pipeline with Ray (not model serving per se, but somewhat similar):

a single node
node has 128 vCPUs
I’d like to build an HTTP API that processes CPU-heavy requests with a function (wrapped/parallelized as a Ray actor); in total, I’d like to cap the number of simultaneous workers at 256 to max out CPU usage

Is it possible to do this with just Ray Serve (without FastAPI/uvicorn, relying on Ray to control the maximum simultaneous parallelism)? How can I specify Ray resources using Ray Serve? (E.g., I’d like to specify that each actor takes up to 0.5 of a CPU resource, with 128 CPU resources in total. I found Resource Allocation — Ray 2.46.0, but it seems to lack a complete example.) Is there a complete code example of such a basic pattern anywhere?

Thanks!

E.g., one can think of an application like GitHub - project-numina/kimina-lean-server: Kimina Lean server, a server that accepts HTTP requests to verify programs in the Lean programming language. This requires a pool of workers, each running an instance of the Lean compiler (which may crash occasionally). Would Ray / Ray Serve be a good framework for such a task?

1. Severity of the issue: (select one)
None: I’m just curious or want clarification.

Akshay_Malik · May 19, 2025, 7:57pm

Hi, thanks for your interest in Ray Serve! Yes, it’s certainly possible to build this with Ray Serve. You can take a look at this example, where resources are specified per deployment: Serve a Stable Diffusion Model — Ray 2.46.0
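
For the numbers in the question, the per-deployment pattern might look roughly like the sketch below. It is not taken from the linked example: `CpuWorker`, `process`, `build_app`, and `main` are hypothetical names, and `max_ongoing_requests=1` is an assumed setting so that each replica handles one request at a time. Note that `num_cpus` is a logical resource used for scheduling, not a hard limit on how much CPU a replica can consume.

```python
import math


def max_replicas(total_cpus: float, cpus_per_replica: float) -> int:
    """Number of replicas that fit if each reserves cpus_per_replica logical CPUs."""
    return math.floor(total_cpus / cpus_per_replica)


TOTAL_CPUS = 128        # vCPUs on the single node (from the question)
CPUS_PER_REPLICA = 0.5  # logical CPUs reserved per replica (from the question)
NUM_REPLICAS = max_replicas(TOTAL_CPUS, CPUS_PER_REPLICA)


class CpuWorker:
    """Hypothetical worker; replace process() with the real CPU-heavy function."""

    def process(self, payload: dict) -> dict:
        # Placeholder for the actual work.
        return {"echo": payload}

    async def __call__(self, request):
        # For HTTP calls, Serve passes a Starlette Request object.
        payload = await request.json()
        return self.process(payload)


def build_app():
    # Imported lazily so the sizing logic above can be checked without Ray installed.
    from ray import serve

    deployment = serve.deployment(
        num_replicas=NUM_REPLICAS,
        ray_actor_options={"num_cpus": CPUS_PER_REPLICA},
        max_ongoing_requests=1,  # assumption: one in-flight request per replica
    )(CpuWorker)
    return deployment.bind()


def main():
    import time

    import ray
    from ray import serve

    ray.init(num_cpus=TOTAL_CPUS)  # declare the node's logical CPU count
    serve.run(build_app(), route_prefix="/")
    while True:  # keep the driver process alive while Serve handles requests
        time.sleep(60)
```

Calling `main()` starts Ray and serves the deployment over HTTP, so a JSON POST to the root route is dispatched to one of the replicas. If some replicas stay pending, the node probably has fewer free logical CPUs than assumed here, and `NUM_REPLICAS` would need to be lowered.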
