Multiple async actors vs single Actor / plain asyncio

How severely does this issue affect your experience of using Ray?

  • Medium: It makes completing my task significantly more difficult, but I can work around it.

Hello, first of all thank you for this great framework.
I would like to understand whether running multiple async actors over the elements of a list gives any real performance improvement over a single async actor or plain async/await, given that with an event loop and coroutines, spawning more processes would not seem to add any value.

My initial guess was that multiple actors could achieve something close to how Rust’s Tokio works, but I might be 100% wrong.

Good question. I did some experiments, but they compared async actors vs. sync actors vs. threading, so I am not sure they address this question.

cc: @cade

@Jules_Damji Thank you for your response. I had already found your article and thought it was really interesting. My question, though, is more about whether to rewrite some existing async/await code to use one or more async actors. I’m trying to understand what I would gain, besides the built-in maximum concurrency handling (which is really nice, btw).

My code is entirely I/O-bound (network requests), and I am running it on a single multi-core Ubuntu server. I’m open to all kinds of suggestions!

Hi @ingandreaguidi! This use-case feels well-served by a single Ray asyncio actor.

It’s hard to answer the root question in a general way. Under high load, a single event loop will hit a limit in throughput and/or latency when processing tasks. Once your event loop is saturated (throughput no longer increases) or the latency variation is unacceptably high, you should move to multiple asyncio actors and shard the input data across them to parallelize the processing. Throughput should then scale roughly linearly with the number of actors, until it is limited by the number of cores in your CPU.
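To make the sharding idea concrete, here is a minimal sketch, not a definitive implementation. The `Worker` class, `fetch`, `make_shards`, and `run_sharded` are hypothetical names, and the `asyncio.sleep` call stands in for a real network request. It assumes Ray is installed and that wrapping a class with `async def` methods in `ray.remote` yields an async actor.

```python
import asyncio


class Worker:
    """Processes one shard of the input on its own event loop."""

    async def fetch(self, item):
        # Placeholder for a real network request (assumption).
        await asyncio.sleep(0.01)
        return item * 2

    async def process_shard(self, shard):
        # All items in the shard are awaited concurrently on one event loop.
        return await asyncio.gather(*(self.fetch(item) for item in shard))


def make_shards(items, num_shards):
    # Round-robin split, so shard sizes differ by at most one.
    return [items[i::num_shards] for i in range(num_shards)]


def run_sharded(items, num_actors):
    # Imported here so the helpers above can be used without a Ray runtime.
    import ray

    ray.init(ignore_reinit_error=True)
    AsyncWorker = ray.remote(Worker)  # one process and event loop per actor
    actors = [AsyncWorker.remote() for _ in range(num_actors)]
    shards = make_shards(items, num_actors)
    refs = [
        actor.process_shard.remote(shard)
        for actor, shard in zip(actors, shards)
    ]
    # One result list per shard, in the same order as the shards.
    return ray.get(refs)
```

Calling `run_sharded(list(range(100)), num_actors=4)` would spread the 100 items over four actors. With `num_actors=1` it reduces to the single-actor case, which makes the two setups easy to compare.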

If you want high confidence, I would run a load test against a single async actor to determine the maximum throughput a single actor can sustain with reasonable latency.
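A load test along these lines could look like the sketch below. It uses plain asyncio so it runs anywhere. `handle` and `load_test` are hypothetical names, and the `asyncio.sleep` call is a stand-in for one request. To test an actor instead, `handle` could presumably await the actor's method call, since Ray object refs are awaitable.

```python
import asyncio
import time


async def handle(item):
    # Placeholder for one I/O-bound request (assumption).
    await asyncio.sleep(0.01)
    return item


async def load_test(num_requests, concurrency):
    # Cap the number of in-flight requests at `concurrency`.
    semaphore = asyncio.Semaphore(concurrency)
    latencies = []

    async def timed(item):
        async with semaphore:
            # Latency is measured from admission, so it excludes queueing time.
            start = time.perf_counter()
            await handle(item)
            latencies.append(time.perf_counter() - start)

    started = time.perf_counter()
    await asyncio.gather(*(timed(i) for i in range(num_requests)))
    elapsed = time.perf_counter() - started

    latencies.sort()
    p99_index = min(len(latencies) - 1, int(len(latencies) * 0.99))
    return {
        "throughput": num_requests / elapsed,  # requests per second
        "p50": latencies[len(latencies) // 2],
        "p99": latencies[p99_index],
    }
```

Running `asyncio.run(load_test(1000, concurrency))` for increasing values of `concurrency` and watching where throughput flattens or the p99 latency climbs gives a rough estimate of the saturation point for one event loop.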

Excellent, @cade. Yes, I would agree: for I/O-bound work, async seems like the best choice.

@ingandreaguidi, let us know how you fare with @cade’s suggestion.