Ray Serve only allows one HTTP server per Ray cluster. When you call serve.start() a second time with a different port, it does not create a new HTTP server—it simply connects to the existing Ray Serve instance, which is already using the first HTTP port.
There are a few ways to work around this.
Option 1: Run Each Deployment on a Separate Ray Cluster
Since HTTP configuration is cluster-scoped, you need to run each application in a separate Ray cluster to have different HTTP ports. Example:
# Start the first Ray cluster
ray start --head --port=6379
# Deploy the first application (from a script that ray.init()s into this cluster)
serve.start(http_options={"host": "0.0.0.0", "port": int(os.environ["HTTP_PORT"])})
# Start the second Ray cluster on a different port
ray start --head --port=6380
# Deploy the second application (from a script connected to the second cluster)
serve.start(http_options={"host": "0.0.0.0", "port": int(os.environ["HTTP_PORT_2"])})
Each deployment runs independently on its own Ray cluster, allowing different ports.
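One pitfall worth flagging here: environment variables always come back as strings (or `None`), while the HTTP port must be an integer, so the value needs an explicit cast. A minimal sketch (the `"8000"` fallback is a hypothetical default, not from your setup):

```python
import os

# os.environ.get returns a string (or None), so cast before handing the
# port to serve.start; "8000" is a hypothetical fallback default.
port = int(os.environ.get("HTTP_PORT", "8000"))
http_options = {"host": "0.0.0.0", "port": port}
```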
Option 2: Use Ray Serve Multi-Application Support
If running multiple clusters is not feasible, you can deploy multiple applications on the same Serve instance using its multi-application support.
Define multiple apps in a Serve config YAML
Deploy it with the serve deploy CLI (our docs cover the specifics, I will link them below)
Instead of different ports, each application gets a different route name (e.g., /app1 and /app2).
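A minimal sketch of such a config, assuming your apps live in file1.py and file2.py and each module exposes a bound application object named `app` (adjust `import_path` to match your actual code):

```yaml
# serve_config.yaml -- deploy with: serve deploy serve_config.yaml
http_options:
  host: 0.0.0.0
  port: 8000          # one shared port for the whole cluster

applications:
  - name: app1
    route_prefix: /app1
    import_path: file1:app   # assumes file1.py defines `app = SomeDeployment.bind()`
  - name: app2
    route_prefix: /app2
    import_path: file2:app
```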
Option 3: Reverse Proxy
If you must use the same Ray cluster, but different external ports, you can use a reverse proxy like NGINX to map requests to different Serve applications.
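For example, a minimal NGINX sketch (the ports and route prefixes here are hypothetical) that exposes each Serve application on its own external port while Serve itself listens on a single port:

```nginx
# Serve listens on one port (8000 here); NGINX fans out two external ports.
server {
    listen 9001;
    location / {
        proxy_pass http://127.0.0.1:8000/app1/;
    }
}
server {
    listen 9002;
    location / {
        proxy_pass http://127.0.0.1:8000/app2/;
    }
}
```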
I can try to explain what the different functions do.
ray.init(): This is basically letting your script know it needs to connect to an existing Ray cluster. If you don’t provide specific details, it’ll try to start a local Ray cluster. This is necessary before you can use any Ray functionalities, including Ray Serve.
serve.start(): This kicks off Ray Serve in your cluster and reads your HTTP options. In multi-app mode the HTTP options are still cluster-wide; what distinguishes the apps is the route prefix each one gets in serve.run(). You only need to call this once per cluster session.
serve.run(): This is where you actually set your deployments live, using any configurations you’ve set up, like your app names and routes. If you set blocking=True, the call blocks the terminal, which is useful for development and debugging since it streams logs to the console. For running multiple applications or scripts, though, you’ll want non-blocking mode or to run them in the background.
There’s a few deployment workflows too.
Single Application: If you are running a single application, you can use serve.run() with blocking=True to keep the terminal open for logs and debugging.
Multiple Applications: Since you have multiple scripts (file1.py and file2.py), you should run serve.run() in a non-blocking mode or in the background. You can do this with & in the terminal, or by setting blocking=False in the script; running the scripts in the background makes it easier to manage multiple applications. Make sure each application has a unique route_prefix to avoid conflicts.
Essentially, start each Python script in a non-blocking way if they need to run concurrently. If you don’t use blocking=True, you’ll need some other way to keep the process running after deployment, with proper process management, so nothing is torn down when the script exits.
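One common way to keep such a process alive (a sketch of the general pattern, not Ray-specific API) is to block on an event that a shutdown signal sets, so the script stays up after a non-blocking serve.run but still exits cleanly on Ctrl+C or SIGTERM:

```python
import signal
import threading

# Event the main thread will wait on after deployment.
stop = threading.Event()


def handle_signal(signum, frame):
    # Wake the main thread so it can exit (and clean up) gracefully.
    stop.set()


signal.signal(signal.SIGINT, handle_signal)
signal.signal(signal.SIGTERM, handle_signal)

# In a real script: serve.run(app, blocking=False) would go here, then:
# stop.wait()  # block until a shutdown signal arrives
```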