Concurrent queries blocking following queries

danielkonzalla · October 27, 2021, 11:55am

Hi everyone,

my question is about concurrent queries in a Serve instance.

We use a setup which can be simplified to the following example. I deployed two simple functions, where the second function depends on the output of the first function:

import asyncio
from typing import Coroutine

import ray
from ray import serve


@serve.deployment()
async def composed_model(_id: int):
    first_func_h = first_func.get_handle()
    second_func_h = second_func.get_handle()
    # Pass the pending result of first_func straight to second_func,
    # without awaiting it here.
    first_res_h = first_func_h.remote(_id=_id)
    second_func_h.remote(_id=first_res_h)

@serve.deployment
async def first_func(_id):
    if _id == 0:
        await asyncio.sleep(1000)
    print(f'First output: {_id}')
    return _id

@serve.deployment
async def second_func(_id):
    while isinstance(_id, (ray.ObjectRef, Coroutine)):
        _id = await _id
    print(f'Second output: {_id}')
    return _id

client = serve.start(detached=True)

composed_model.deploy()
first_func.deploy()
second_func.deploy()

main_p = composed_model.get_handle()
main_p.remote(_id=0)
main_p.remote(_id=1)

Expected output

When executing the script above, we would expect both functions to process the second query (_id=1) while the first query (_id=0) is still sleeping, since we are using async code:

First output: 1
Second output: 1
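
To make the expectation concrete, here is a minimal sketch in plain asyncio, with no Ray involved. The names first_stage, second_stage and run_pipeline are hypothetical stand-ins for the deployments above, and the short delays replace the 1000-second sleep. It only illustrates the ordering we expected, not how Serve schedules requests internally.

```python
import asyncio


# Hypothetical stand-ins for the two deployments: plain coroutines, not Ray Serve.
async def first_stage(_id: int, delay: float) -> int:
    await asyncio.sleep(delay)
    return _id


async def second_stage(upstream: "asyncio.Task[int]") -> int:
    # The downstream stage resolves the still-pending upstream result itself.
    return await upstream


async def run_pipeline(log: list, _id: int, delay: float) -> None:
    # Start the first stage and hand its pending result to the second stage.
    upstream = asyncio.create_task(first_stage(_id, delay))
    log.append(f"Second output: {await second_stage(upstream)}")


async def main() -> tuple:
    log: list = []
    slow = asyncio.create_task(run_pipeline(log, 0, 0.2))  # query with _id=0
    fast = asyncio.create_task(run_pipeline(log, 1, 0.0))  # query with _id=1
    await fast
    snapshot = list(log)  # what has been logged while query 0 is still pending
    await slow
    return snapshot, log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In this sketch the query with _id=1 passes through both stages while the query with _id=0 is still waiting, which is the behavior we expected from the Serve example.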

Actual output

However, the second function seems to be blocked by the first query with _id=0 and the second query is only processed by the first function:

First output: 1

Workaround

Currently we are using a workaround, where we await the output of the first function:

@serve.deployment()
async def composed_model(_id: int):
    first_func_h = first_func.get_handle()
    second_func_h = second_func.get_handle()
    first_res_h = first_func_h.remote(_id=_id)
    second_func_h.remote(_id=await first_res_h)

With this workaround, we get the expected output mentioned above. However, it hurts our performance: the await creates a bottleneck, because composed_model has to wait for the first function to finish, which in our case takes quite some time. Our setup has several of these long-running functions, and the functions that follow would have to wait for all of them to finish.

Another workaround would be to use the asyncio.wait() function, but we expected the example above to work as is.
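
For reference, a rough sketch of what the asyncio.wait() variant could look like, again in plain asyncio. The coroutines first_stage, second_stage and chain are hypothetical placeholders, not Serve handles, and whether this maps cleanly onto Serve handle calls is an assumption. Each first-then-second chain runs as its own task, and results are collected as they finish, so a slow chain does not hold up a fast one.

```python
import asyncio


# Hypothetical placeholders for the two functions; not the Ray Serve API.
async def first_stage(_id: int, delay: float) -> int:
    await asyncio.sleep(delay)
    return _id


async def second_stage(_id: int) -> int:
    return _id


async def chain(_id: int, delay: float) -> int:
    # The await lives inside the chain, so it only blocks this one chain.
    return await second_stage(await first_stage(_id, delay))


async def main() -> list:
    pending = {
        asyncio.create_task(chain(0, 0.2)),  # slow query
        asyncio.create_task(chain(1, 0.0)),  # fast query
    }
    order = []
    while pending:
        # Wake up as soon as any chain has finished.
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        order.extend(task.result() for task in done)
    return order


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The design choice here is to keep the await, but move it into a per-query task so that the waiting happens concurrently instead of serially.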

Is there a reason for this behavior, or is this some kind of bug?

simon-mo · October 28, 2021, 3:28am

Hi @danielkonzalla, thanks for the report! This seems like a bug. I investigated it a bit and it turns out to be a Ray Core issue (Ray Core is the system Ray Serve relies on). I filed it here: [Core] Concurrent (and async) actors should allow later invocations to execute when dependency ready · Issue #19822 · ray-project/ray · GitHub and I will keep track of it. Thank you!

simon-mo · November 22, 2021, 6:43am

Hi @danielkonzalla, the underlying Ray Core issue has been addressed after a series of PRs. I have verified that the issue no longer occurs on Ray nightly (see "Installing Ray" in the Ray v2.0.0.dev0 documentation). I will also add a Ray Serve regression test to ensure this behavior stays correct.
