Can I deploy services to other machines in the cluster?

I have a cluster consisting of three machines (A, B, C). I want to specify the IP of B in serve.start() on machine A, but it reports that the binding fails. Does Ray Serve support this deployment method?

@839576266 you only need to call serve.start on one machine; the replicas of your deployments will be placed across the cluster.

The http_host is the address the HTTP server binds to on each node (typically localhost or 0.0.0.0).

I tried to bind the host in serve.start(), but it doesn’t seem to work. Is there anything else I need to configure?

import time

import ray
from ray import serve

environment_dict = {
    "working_dir": "/home/hwd/kernel"
}

ray.init(address='ray://10.3.70.138:10001', runtime_env=environment_dict)

http_dict = {"host": "10.3.70.140", "port": 9009}
serve.start(http_options=http_dict)


@serve.deployment
def hello(request):
    name = request.query_params["name"]
    return f"Hello {name}!"


# Deploy model.
info = hello.options(num_replicas=3).deploy()
while True:
    time.sleep(5)

But the script raised an exception:

Traceback (most recent call last):
  File "/tmp/pycharm_project_188/tests/hik_serving/test/ray_test.py", line 19, in <module>
    serve.start(http_options=http_dict)
  File "/usr/lib/fedai/hikflkernel/.venv/lib/python3.8/site-packages/ray/serve/api.py", line 465, in start
    ray.get(
  File "/usr/lib/fedai/hikflkernel/.venv/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
    return getattr(ray, func.__name__)(*args, **kwargs)
  File "/usr/lib/fedai/hikflkernel/.venv/lib/python3.8/site-packages/ray/util/client/api.py", line 42, in get
    return self.worker.get(vals, timeout=timeout)
  File "/usr/lib/fedai/hikflkernel/.venv/lib/python3.8/site-packages/ray/util/client/worker.py", line 359, in get
    res = self._get(to_get, op_timeout)
  File "/usr/lib/fedai/hikflkernel/.venv/lib/python3.8/site-packages/ray/util/client/worker.py", line 386, in _get
    raise err
types.RayTaskError(ValueError): ray::HTTPProxyActor.ready() (pid=27111, ip=10.3.70.138, repr=<ray.serve.http_proxy.HTTPProxyActor object at 0x7f8f8119be20>)
OSError: [Errno 99] Cannot assign requested address

During handling of the above exception, another exception occurred:

ray::HTTPProxyActor.ready() (pid=27111, ip=10.3.70.138, repr=<ray.serve.http_proxy.HTTPProxyActor object at 0x7f8f8119be20>)
  File "/usr/local/abm/python/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/local/abm/python/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/usr/lib/fedai/hikflkernel/.venv/lib/python3.8/site-packages/ray/serve/http_proxy.py", line 329, in ready
    return await done_set.pop()
  File "/usr/lib/fedai/hikflkernel/.venv/lib/python3.8/site-packages/ray/serve/http_proxy.py", line 348, in run
    raise ValueError(
ValueError: Failed to bind Ray Serve HTTP proxy to '10.3.70.140:9009'.
Please make sure your http-host and http-port are specified correctly. 

@839576266 are you sure that 10.3.70.140 is the right address to bind to on the cluster? If you try localhost or 0.0.0.0 does it work?
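
For reference, Errno 99 in the traceback is what the OS returns when a process tries to bind a socket to an IP that is not assigned to any interface on the machine it runs on. The traceback shows the HTTPProxyActor running on 10.3.70.138, so it cannot bind 10.3.70.140. Here is a minimal standard-library sketch of the same failure, independent of Ray. The helper name can_bind is made up for illustration, and 192.0.2.1 is a documentation-only address that I am assuming is not configured on your machine:

```python
import socket


def can_bind(host: str, port: int = 0) -> bool:
    """Return True if a TCP socket can be bound to (host, port) on this machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Port 0 asks the OS for any free port, so only the host matters here.
            sock.bind((host, port))
        except OSError:
            # On Linux, a non-local address fails with errno 99 (EADDRNOTAVAIL).
            return False
        return True


if __name__ == "__main__":
    print(can_bind("127.0.0.1"))  # loopback is always a local address
    print(can_bind("192.0.2.1"))  # assumed not to be assigned to this machine
```

Running the same check with 10.3.70.140 on the head node should show whether that address can be bound there at all.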

I have deployed a Ray cluster where 10.3.70.138 is the head node and 10.3.70.140 is the worker node. I am running this program on 10.3.70.138, and if I use localhost it creates a Serve deployment on 10.3.70.138. But I want to create the Serve deployment on 10.3.70.140, and I’m not sure whether Ray supports this.

@839576266 the replicas of the deployment will run across the cluster; you don’t need to set the host for each node in the cluster individually.

What does “replicas” in Ray Serve refer to? I completed my deployment through hello.options(num_replicas=3).deploy(), but the port is only open on the head node; I cannot find any related processes on the other machines in the cluster.

The head node’s HTTP proxy will load balance requests to each of the hello replicas.
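
If the goal is to also have an HTTP endpoint on 10.3.70.140, one option that may help is the location field of http_options. This is a rough sketch, assuming a Ray 1.x version where serve.start accepts "location": "EveryNode" in http_options (please check the docs for your version). build_http_options and start_serve are illustrative helper names, not part of Ray:

```python
def build_http_options(port: int = 9009) -> dict:
    """Build http_options that bind on all interfaces of whichever node runs a proxy."""
    return {
        # 0.0.0.0 is valid on every node, unlike a specific node's IP.
        "host": "0.0.0.0",
        "port": port,
        # Assumption: this Ray version supports "EveryNode" (one HTTP proxy per node).
        "location": "EveryNode",
    }


def start_serve(address: str = "ray://10.3.70.138:10001") -> None:
    """Connect to the cluster and start Serve with the options above."""
    # Imported here so build_http_options can be used without Ray installed.
    import ray
    from ray import serve

    ray.init(address=address)
    serve.start(http_options=build_http_options())
```

If this works on your version, requests could be sent to port 9009 on either node, and each proxy would still route to hello replicas anywhere in the cluster.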