[High] How to raise a second grpc replica on the same server?

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I created a gRPC application via Ray, but I noticed a bug: after load testing, the method responsible for batching requests blocks until the service is restarted.
I could not figure out what was wrong, so I decided to simply run a second replica.
The server has 2 GPUs, and I wanted to run one gRPC service on each, but I can't.

    @serve.deployment(ray_actor_options={"num_cpus": 12})
    class MyService(test_pb2_grpc.MyServiceServicer, gRPCIngress):

        @serve.batch(max_batch_size=1, batch_wait_timeout_s=0.150)
        async def handle_batch(self, requests: RequestBatch):
            return ["ok" for _ in requests]

        async def Method(self, request, context: grpc.ServicerContext):
            return await self.handle_batch(request)
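If the goal is one replica of the same deployment per GPU, Ray Serve can usually express that with `num_replicas` plus a per-replica `num_gpus` resource request, rather than a second cluster. A minimal configuration sketch, mirroring the class above (the replica count and the `num_gpus` value are my assumptions, not from the original post):

```python
from ray import serve


@serve.deployment(
    num_replicas=2,  # one replica per GPU on this 2-GPU node
    ray_actor_options={"num_cpus": 12, "num_gpus": 1},  # each replica reserves one GPU
)
class MyService(test_pb2_grpc.MyServiceServicer, gRPCIngress):
    ...
```

With `num_gpus=1`, Ray's scheduler places each replica on its own GPU via `CUDA_VISIBLE_DEVICES`, so the two replicas do not contend for the same device.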

I run it like this:

    FROM rayproject/ray-ml:2.3.0-py38-gpu
    CMD ldconfig \
        && OMP_NUM_THREADS=${OMP_NUM_THREADS} ray start --head \
        --port=$RAY_HEAD_PORT \
        # --include-dashboard=false \
        --dashboard-host= \
        --dashboard-port=$RAY_DASHBOARD_PORT \
        && serve run main:stt_deployment

What’s the error message you see?
The question is a bit confusing: are you trying to run 2 gRPC applications on a single cluster, or 2 clusters, each with its own gRPC application?

I’m trying to run 2 gRPC applications on a single cluster using num_replicas.
But nothing works. I had to start 2 head Ray clusters (ray start --head && serve run ... in two Docker containers) with different dashboard ports in order to deploy 2 replicas of the same application.
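For what it's worth, one thing that can block two gRPC ingress replicas on the same node is the listen port: each replica runs its own gRPC server, so both cannot bind the same fixed port. A hedged workaround sketch under that assumption (the `pick_free_port` helper is mine, not a Ray API) lets each replica ask the OS for an unused port in its constructor; the replica would then need to advertise that port to clients, e.g. via a service registry:

```python
import socket


def pick_free_port() -> int:
    """Ask the OS for an unused TCP port.

    Each gRPC ingress replica on the same node can call this before
    starting its server so the two replicas do not fight over one
    fixed port number.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", 0))  # port 0 = let the kernel choose any free port
        return s.getsockname()[1]
```

This only sidesteps the bind conflict; clients still need some way to discover which port each replica chose.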