[Core] Why are actors executed sequentially?

I have an actor:

import time
import ray

@ray.remote
class SimulationWorkerActor:

    def __init__(self):
        self.loop = None

    async def run_loop(self, loop: Loop, split_id: int):
        self.split_id = split_id
        self.loop = loop
        start = time.time()
        print(f'Started loop for split {split_id}')
        loop.run()  # blocking, CPU-intensive computation
        self.run_loop_time = time.time() - start
        print(f'Finished loop for split {split_id} in {self.run_loop_time}s')

And orchestration code:

actors = [SimulationWorkerActor.options(num_cpus=1).remote() for _ in range(len(self.generators))]
print(f'Inited {len(actors)} worker actors')

refs = [actors[i].run_loop.remote(
    loop=Loop(...),
    split_id=i
) for i in range(len(actors))]
print(f'Scheduled loops, waiting for finish...')

# wait for all runs to finish
ray.get(refs)

What I expect is for all of the run_loop calls to run in parallel; however, the logs show that the Ray cluster executes them sequentially:

Scheduled loops, waiting for finish...
(SimulationWorkerActor pid=11588, ip=10.244.3.10) Started loop for split 0
(SimulationWorkerActor pid=11588, ip=10.244.3.10) Finished loop for split 0 in 2.4789621829986572s
(SimulationWorkerActor pid=11412, ip=10.244.2.10) Started loop for split 1
(SimulationWorkerActor pid=11412, ip=10.244.2.10) Finished loop for split 1 in 2.550433397293091s
(SimulationWorkerActor pid=9168, ip=10.244.0.10) Started loop for split 2
(SimulationWorkerActor pid=9168, ip=10.244.0.10) Finished loop for split 2 in 2.5661652088165283s
(SimulationWorkerActor pid=8806, ip=10.244.4.11) Started loop for split 3
(SimulationWorkerActor pid=8806, ip=10.244.4.11) Finished loop for split 3 in 2.499436140060425s

Why is this happening? How do I make my actors work independently, in parallel?

My setup:

Ray 2.4.0, cluster running in minikube on an M2 Mac

This is the expected behavior of asyncio. Unless you use await, the context is not switched.

You can observe the same behavior when you use asyncio.run.

If you want multiple threads, you can use AsyncIO / Concurrency for Actors — Ray 3.0.0.dev0 instead. But note that Python's concurrency is always limited by the GIL, and only one thread can run at a time.
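
For illustration, a minimal sketch (plain asyncio, not Ray-specific) of the point about await: tasks on a single event loop only interleave at await points, so a blocking call makes them run back to back.

import asyncio
import time

async def blocking_task(i: int) -> None:
    # time.sleep blocks the whole event loop, so no other task can run
    print(f'task {i} started')
    time.sleep(1)
    print(f'task {i} finished')

async def yielding_task(i: int) -> None:
    # await hands control back to the event loop, so the tasks interleave
    print(f'task {i} started')
    await asyncio.sleep(1)
    print(f'task {i} finished')

async def main() -> None:
    await asyncio.gather(*(blocking_task(i) for i in range(3)))   # ~3s, sequential
    await asyncio.gather(*(yielding_task(i) for i in range(3)))   # ~1s, concurrent

asyncio.run(main())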

@sangcho
Where exactly should I use await? I use ray.get, which should do the same thing?

Also please note, I'm not calling multiple blocking async functions in the context of the same actor/Python process (in that case I agree they would run sequentially). I spawn multiple actors first (which, as I understand it, act as independent Python processes), and then run a blocking operation on each of them. I expect those operations to execute in parallel, since they run in separate processes. The GIL locks per Python process/actor, so I'm not sure why it would apply here. What am I missing?
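
For comparison, here's a minimal sketch (plain multiprocessing, not Ray, with a hypothetical busy loop standing in for my workload) of what I mean by the GIL being per process: CPU-bound work in separate processes does overlap.

import time
from multiprocessing import Pool

def busy(_: int) -> None:
    # hypothetical stand-in for a CPU-bound workload: spin for ~2 seconds
    end = time.time() + 2
    while time.time() < end:
        pass

if __name__ == '__main__':
    start = time.time()
    with Pool(4) as pool:
        pool.map(busy, range(4))
    # prints roughly 2s, not 8s, because each worker process has its own GIL
    print(f'4 CPU-bound tasks across 4 processes took {time.time() - start:.1f}s')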

Hmm sorry, I think I misread the question.

I don’t think it should run sequentially in this case. What’s your Loop here? Can you give me a script I can actually run to reproduce this?
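
For reference, a self-contained sketch along these lines (with your Loop replaced by a hypothetical CPU-bound stand-in) would be enough to reproduce it:

import time
import ray

ray.init()

@ray.remote
class SimulationWorkerActor:
    async def run_loop(self, duration_s: float, split_id: int) -> None:
        # hypothetical busy loop standing in for the blocking loop.run()
        start = time.time()
        print(f'Started loop for split {split_id}')
        while time.time() - start < duration_s:
            pass
        print(f'Finished loop for split {split_id} in {time.time() - start}s')

actors = [SimulationWorkerActor.options(num_cpus=1).remote() for _ in range(4)]
ray.get([a.run_loop.remote(duration_s=2.0, split_id=i) for i, a in enumerate(actors)])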