"await" vs "asyncio.gather" when making multiple calls to Deployment

Lewis_Bails1 · November 11, 2022, 8:40am

I’m trying to use the model composition design pattern in Ray Serve but I’m struggling to come up with the best solution when dealing with multiple calls to sub-models.

Which of these solutions would you say is more Ray-thonic (or even uses “await” / “asyncio.gather” properly)?

Using asyncio.gather:

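import asyncio
from ray import serve

@serve.deployment
class Model:
    ...
    async def __call__(self, texts: list[str]):
        # One chained tokeniser -> predictor call per input text.
        tasks = [self.predictor_handle.remote(self.tokeniser_handle.remote(t)) for t in texts]
        # The first gather resolves the handle calls to refs, the second resolves the refs to results.
        refs = await asyncio.gather(*tasks)
        results = await asyncio.gather(*refs)
        return results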

Awaiting individual asyncio.Tasks/ray.ObjectRefs:

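@serve.deployment
class Model:
    ...
    async def __call__(self, texts: list[str]):
        # One chained tokeniser -> predictor call per input text.
        tasks = [self.predictor_handle.remote(self.tokeniser_handle.remote(t)) for t in texts]
        # Await each task in turn to get its ref, then each ref in turn to get its result.
        refs = [await t for t in tasks]
        results = [await r for r in refs]
        return results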

I’m not very experienced with asyncio, so I’m not sure which would be faster at scale or involve the fewest blocking calls. My initial tests have been inconclusive. It would be great if the Ray community could help me out!
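To make the comparison easier to reason about, here is a minimal plain-asyncio sketch with no Ray involved. `fake_call` is a hypothetical stand-in for a remote model call, and the sketch assumes the handle calls behave like already-scheduled `asyncio.Task`s, which may not hold for every Ray Serve version. Under that assumption, gathering and awaiting one by one should take about the same wall-clock time, because the work is already in flight. Awaiting bare coroutines one by one is the case that serialises the calls.

```python
import asyncio
import time


async def fake_call(x: int, delay: float = 0.1) -> int:
    """Hypothetical stand-in for a remote model call; not a Ray API."""
    await asyncio.sleep(delay)
    return x * 2


async def with_gather(items: list[int]) -> list[int]:
    # Tasks are scheduled as soon as they are created.
    tasks = [asyncio.create_task(fake_call(i)) for i in items]
    return await asyncio.gather(*tasks)


async def with_individual_awaits(items: list[int]) -> list[int]:
    tasks = [asyncio.create_task(fake_call(i)) for i in items]
    # Each await waits on a task that is already running in the background.
    return [await t for t in tasks]


async def with_bare_coroutines(items: list[int]) -> list[int]:
    # No tasks here: each coroutine only starts when it is awaited,
    # so the calls run strictly one after another.
    return [await fake_call(i) for i in items]


async def timed(fn, items: list[int]):
    start = time.perf_counter()
    result = await fn(items)
    return result, time.perf_counter() - start


def compare(n: int = 5) -> dict:
    """Run all three variants and return {name: (results, seconds)}."""
    items = list(range(n))

    async def main() -> dict:
        variants = (with_gather, with_individual_awaits, with_bare_coroutines)
        return {fn.__name__: await timed(fn, items) for fn in variants}

    return asyncio.run(main())


if __name__ == "__main__":
    for name, (results, seconds) in compare().items():
        print(f"{name}: {results} in {seconds:.2f}s")
```

If the two task-based variants time the same in a test like this, the choice between them is mostly about readability and error handling rather than speed.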
