Set different HTTP port for different deployments

I can try to explain what the different functions do.

  1. ray.init(): This connects your script to a Ray cluster. If you don’t pass an address, it looks for an existing cluster and, if it finds none, starts a local one. You need this before you can use any Ray functionality, including Ray Serve.
  2. serve.start(): This starts Ray Serve on the cluster and applies your HTTP options. Options such as host and port apply to the whole Serve instance, so in multi-app mode your applications are told apart by their route prefixes rather than by port. You only need to call this once per cluster session.
  3. serve.run(): This is where you actually set your deployments live, using the configuration you’ve set up, like your app names and routes. If you set blocking=True, the call does not return and keeps the terminal occupied, which is useful for development and debugging because it streams logs to the console. If you are running multiple applications or scripts, you will probably want to run it in non-blocking mode or in the background instead.
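To make the three calls above concrete, here is a minimal sketch of what one script could look like. The app name, route prefix, port, and the Hello class are placeholders I made up, and it assumes a recent Ray 2.x release where serve.run() accepts the blocking argument. Ray is imported inside main() only so the file can be imported without starting anything.

```python
# file1.py: illustrative sketch only; names, route, and port are assumptions.

APP_NAME = "app1"        # hypothetical application name
ROUTE_PREFIX = "/app1"   # hypothetical route; must be unique per application
HTTP_PORT = 8000         # shared by every application on this Serve instance


class Hello:
    """Plain class; it is wrapped into a Serve deployment in main()."""

    def __call__(self, request) -> str:
        return f"hello from {APP_NAME}"


def main() -> None:
    # Imported here so the module can be loaded without touching Ray.
    import ray
    from ray import serve

    # 1. Connect to an existing cluster if one is found, else start a local one.
    ray.init()

    # 2. Start Serve once; the HTTP options apply to the whole Serve instance.
    serve.start(http_options={"host": "127.0.0.1", "port": HTTP_PORT})

    # 3. Deploy the application under its own name and route prefix.
    app = serve.deployment(Hello).bind()
    serve.run(app, name=APP_NAME, route_prefix=ROUTE_PREFIX, blocking=True)


if __name__ == "__main__":
    main()
```

With blocking=True the script stays in the foreground until you interrupt it, so a second script would need its own terminal or a background process.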

There are a few deployment workflows too.

  • Single Application: If you are running a single application, you can use serve.run() with blocking=True to keep the terminal open for logs and debugging.
  • Multiple Applications: Since you have multiple scripts (file1.py and file2.py), consider running serve.run() in non-blocking mode or in the background. You can do that by appending & to the command in the terminal, or by passing blocking=False inside the script. Either way, make sure each application has a unique route_prefix to avoid conflicts.
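If you would rather not juggle & in the shell, one option is a small launcher that starts each script as a child process. This is only a sketch using the standard library; the helper names launch_in_background and wait_for_all are my own, and it assumes file1.py and file2.py sit in the current directory.

```python
import subprocess
import sys
from typing import List, Sequence


def launch_in_background(scripts: Sequence[str]) -> List[subprocess.Popen]:
    """Start each script as its own child process without waiting for it."""
    return [subprocess.Popen([sys.executable, script]) for script in scripts]


def wait_for_all(procs: Sequence[subprocess.Popen]) -> List[int]:
    """Block until every child has exited and return the exit codes in order."""
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    children = launch_in_background(["file1.py", "file2.py"])
    try:
        wait_for_all(children)
    except KeyboardInterrupt:
        # Ctrl+C in the launcher: ask every child to stop as well.
        for child in children:
            child.terminate()
```

Each script still calls serve.run() with its own name and route_prefix; the launcher only takes care of running them side by side.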

Essentially, if the scripts need to run concurrently, start each one in a non-blocking way. If you don’t want to use blocking=True, the script still has to stay alive after it deploys, so you need some other way to keep the process running (for example, running it in the background or under a process manager) without tying up the terminal.
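Here is one way the non-blocking variant could look, as a rough sketch rather than the official pattern. The script deploys with blocking=False and then parks on a threading.Event until it receives Ctrl+C or SIGTERM. The app name, route, and the echo function are placeholders, and I am assuming a recent Ray 2.x release where serve.run() takes blocking and serve.delete() removes a single application.

```python
# file2.py: illustrative sketch only; names and route are assumptions.
import signal
import threading


def install_stop_handlers(stop: threading.Event) -> None:
    """Set the event on Ctrl+C or SIGTERM so the main loop can exit cleanly."""
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, lambda signum, frame: stop.set())


def echo(request) -> str:
    """Plain function; it is wrapped into a Serve deployment in main()."""
    return "hello from app2"


def main() -> None:
    # Imported here so the module can be loaded without touching Ray.
    import ray
    from ray import serve

    ray.init()
    app = serve.deployment(echo).bind()

    # Returns once the application is deployed instead of holding the terminal.
    serve.run(app, name="app2", route_prefix="/app2", blocking=False)

    stop = threading.Event()
    install_stop_handlers(stop)
    while not stop.wait(timeout=1.0):
        pass  # keep the process alive until a stop signal arrives

    # Remove only this application; other apps on the cluster are untouched.
    serve.delete("app2")


if __name__ == "__main__":
    main()
```

Run it with & or under your process manager of choice; sending SIGTERM to the process then takes down app2 only.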