I have several problems.
- Is it possible to run multiple gRPC services in the same cluster, to which I plan to connect 2-3 remote servers (nodes),
and then reach each service at the same address (host:port) but with a different endpoint?
For example,
head = 172.198.0.2:9000
I started two services with different endpoints, but on the same port, in a cluster connected to the head node. Now I would like to be able to send requests to either of them at addresses like these:
172.198.0.2:9000/cluster_gpu/models1
172.198.0.2:9000/cluster_gpu/models2
172.198.0.2:9000/cluster_cpu/models3 (for example, served by another cluster launched without a GPU)
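To make the routing I have in mind concrete, here is a minimal sketch in plain Python. It is not Ray Serve or gRPC code: `ROUTES` and `resolve` are hypothetical names that only illustrate the mapping from an endpoint on the head address to the service that should answer.

```python
from typing import Tuple

# Hypothetical illustration only: plain Python, no Ray or gRPC APIs involved.
HEAD = "172.198.0.2:9000"

# Endpoint prefix -> (cluster label, service name), taken from the addresses above.
ROUTES = {
    "/cluster_gpu/models1": ("cluster_gpu", "models1"),
    "/cluster_gpu/models2": ("cluster_gpu", "models2"),
    "/cluster_cpu/models3": ("cluster_cpu", "models3"),
}


def resolve(target: str) -> Tuple[str, str]:
    """Split 'host:port/path' and look up which service should answer."""
    address, _, path = target.partition("/")
    if address != HEAD:
        raise ValueError(f"unknown head address: {address}")
    try:
        return ROUTES["/" + path]
    except KeyError:
        raise LookupError(f"no service registered for /{path}") from None


if __name__ == "__main__":
    # Every endpoint shares the same host:port; only the path differs.
    for prefix in ROUTES:
        print(HEAD + prefix, "->", resolve(HEAD + prefix))
```

In other words, every request goes to the same host:port, and only the path decides which cluster and which service handles it.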
- I tried to start one service on a remote machine, but I don't understand why the service is also started on the head node. From the documentation, I had the impression that the head node serves only as a router.
- Moreover, the service starts on the head node and on the remote machine at the same time.
It also endlessly restarts the replicas on the current host, where I meant to deploy them, while they start successfully on the head node.
Currently I'm running everything locally via docker-compose with ray[serve]==2.1.0.
My Dockerfile for the head node (e.g., 172.198.0.2):
FROM rayproject/ray:2.1.0-gpu
...
ENV RAY_EXPERIMENTAL_NOSET_CUDA_VISIBLE_DEVICES=0
ENTRYPOINT [\
"ray", "start",\
"--head",\
"--port=6379",\
"--redis-shard-ports=6380,6381",\
"--object-manager-port=22345",\
"--node-manager-port=22346",\
"--dashboard-host=0.0.0.0",\
"--ray-client-server-port=10001",\
"--block"]
Dockerfile for the serve node (e.g., 172.198.0.3, with RAY_HEAD_ADDRESS=172.198.0.3:6379):
FROM rayproject/ray:2.1.0-gpu
...
ENV RAY_EXPERIMENTAL_NOSET_CUDA_VISIBLE_DEVICES=0
ENV RAY_HEAD_ADDRESS=${RAY_HEAD_ADDRESS}
CMD ray start --address=$RAY_HEAD_ADDRESS && \
serve run \
serve_grpc:my_deployment \
--runtime-env=runtime.yaml
serve_grpc.py:
....
@serve.deployment(
is_driver_deployment=True,
name="adapter",
ray_actor_options={"num_gpus": 1},
num_replicas=1,
route_prefix="/gpu/models1"
)
class ModelsServices(models_pb2_grpc.ModelsServiceServicer, gRPCIngress):
....
my_deployment = ModelsServices.bind()
Could you please tell me how to set this up correctly?