How to share memory between 2 replicas in Ray Serve

Hi all,

I've just started learning Ray and Ray Serve, and I have a question: how can two replicas share memory in Ray Serve? I have the following code:

import numpy as np
import ray
from ray import serve
from starlette.requests import Request

class Test:
    def __init__(self):
        self.__ocr = np.zeros((10000, 10000))

    def __call__(self, request: Request):
        print("id:" + str(id(self.__ocr)))
        return {"results": ""}

client = serve.start(http_host="", http_port=8866)

backend_config = serve.BackendConfig(num_replicas=2, max_concurrent_queries=5)

client.create_backend("test_backend", Test, config=backend_config)
client.create_endpoint("test_endpoint", backend="test_backend",
                       route="/test", methods=["POST"])

When I send two requests to the endpoint /test, the output is:

(pid=22245) id:139755359920576
(pid=22249) id:140652686941200

So it seems they are not sharing the same memory for the variable self.__ocr.

When I change the code to use ray.put in the class Test:

class Test:
    def __init__(self):
        self.__ocr = ray.put(np.zeros((10000, 10000)))

    def __call__(self, request: Request):
        print("id:" + str(self.__ocr))
        return {"results": ""}

Now the output is:

(pid=22920) id:ObjectRef(a0c4d0d97e6934502151b6ab9585e9d201000000)
(pid=22915) id:ObjectRef(6f4e53378512deb8bf3df443d39b792201000000)

So it seems the two replicas are still not sharing the same memory for self.__ocr either.

The code above runs on my laptop. I set num_replicas=2 in serve.BackendConfig and want the two replicas to share the memory of an object. I know plain Ray can achieve this simply, with code like the following:

import numpy as np
import ray

@ray.remote
def func(array, param):
    # Do stuff.
    return 1

array = np.ones(10**6)
# Store the array in the shared-memory object store once
# so it is not copied multiple times.
array_id = ray.put(array)
result_ids = [func.remote(array_id, i) for i in range(4)]
output = ray.get(result_ids)

But how do we achieve the same thing in Ray Serve? This question is probably not complicated; I'm just starting out with Ray and Ray Serve and don't know the answer. I'm hoping Ray Serve can do a zero-copy read of a variable shared between two replicas on one node.

I checked the Ray Serve documentation, but it doesn't mention the correct way to share memory between two replicas. I'm posting here for help.

Thank you !

Hi @zhfkt, currently you're creating a new object in each replica's constructor and pushing it to the object store, so every replica ends up with its own copy. Instead, your class should accept a reference to the object as a parameter, e.g. something like this:

class Test:
    def __init__(self, ocr):
        self._ocr = ocr  # the same ObjectRef in every replica

    def __call__(self, request):
        print("id:" + str(self._ocr))
        return {"results": ""}

# Put the data into the object store once, then pass the
# reference to each replica as an init argument.
ocr_data = ray.put(np.zeros((10000, 10000)))
client.create_backend("test_backend", Test, ocr_data, config=backend_config)


Here you would first create your data object, push it to the object store, and then instantiate each replica with a copy of the reference to the object.

Does this help?


Yes. Thank you. Let me check!