Map DeploymentResponse to Dataset

How severely does this issue affect your experience of using Ray?

  • Medium: It makes completing my task significantly harder, but I can work around it.

How can I map a DeploymentHandle.remote() call to a Dataset? I’ve tried several different things. Here is a toy example of what I am trying to do:

import ray
from ray import serve
from ray.serve.handle import DeploymentHandle, DeploymentResponse

import numpy as np
import asyncio
import requests


@serve.deployment
class ToyModel:
    def __init__(self):
        pass

    async def __call__(self, request):
        await asyncio.sleep(3)
        # np.random.random takes the output shape as a single tuple
        return np.random.random((256, 256, 3))


@serve.deployment()
class ToyProducer:
    def __init__(self, handle: DeploymentHandle):
        self.handle = handle

    async def __call__(self, request):
        ds = ray.data.from_numpy(np.random.random((3, 256, 256, 3)))

        def call_toy_model(block):
            block["pred"] = self.handle.remote(block["data"])
            return block

        ds = ds.map(call_toy_model)

        return ds.materialize()


handle = ToyModel.bind()
producer = ToyProducer.bind(handle)

ray.init()
serve.run(producer)

# Call the producer to trigger the pipeline
ret = requests.get("http://localhost:8000/")

print(ret.content.decode("utf-8"))

I’m not too familiar with Ray Datasets, but does this work if you resolve the result of the self.handle.remote(block["data"]) call before storing it in block? Or is the expected behavior that you can return an ObjectRef and Ray Data will resolve it for you?

If that’s the case, then you may need to convert the result of self.handle.remote(block["data"]) to a Ray ObjectRef. You can use the _to_object_ref developer API to do so.
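To make the first option concrete (resolve the call before writing it into the row), here is a rough sketch that runs without Ray. FakeHandle and FakeResponse are hypothetical stand-ins I made up for illustration; they only mimic the shape of handle.remote(...) returning a pending response with a blocking result() method, and are not Ray classes. The toy_model function and the doubled values are likewise placeholders. Whether calling result() inside a Ray Data map function behaves the same way in your setup is an assumption you would need to verify.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeResponse:
    """Hypothetical stand-in for a pending response (NOT a Ray class)."""

    def __init__(self, future):
        self._future = future

    def result(self, timeout_s=None):
        # Block until the underlying call finishes, then return its value.
        return self._future.result(timeout=timeout_s)


class FakeHandle:
    """Hypothetical stand-in for a handle whose remote() returns immediately."""

    def __init__(self, fn):
        self._fn = fn
        self._pool = ThreadPoolExecutor(max_workers=4)

    def remote(self, *args):
        # Schedule the call and hand back a response object, not the value.
        return FakeResponse(self._pool.submit(self._fn, *args))


def toy_model(data):
    # Placeholder "model": doubles every input value.
    return [2 * x for x in data]


handle = FakeHandle(toy_model)


def call_toy_model(row):
    response = handle.remote(row["data"])
    # Resolve the response first, so the row holds plain data
    # rather than a pending response object.
    row["pred"] = response.result()
    return row


rows = [{"data": [1, 2]}, {"data": [3]}]
mapped = [call_toy_model(dict(row)) for row in rows]
print(mapped)
```

The point of the sketch is only the ordering inside call_toy_model: the value stored under "pred" is the resolved result, not the object returned by remote().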

I’m not familiar with the _to_object_ref API, but it looks like it might help. Thank you! I will let you know if this solves the issue.
