How to maximize parallelization efficiency using Python Ray ActorPool?

I recently started using Ray ActorPool to parallelize my Python code on my local computer (using the code below), and it’s definitely working. Specifically, I use it to process a list of arguments and return a list of results (note that, depending on the input, the “process” function can take different amounts of time).

However, while testing the script, the processes seem to be “blocking” each other: when one call takes a long time, the other cores stay more or less idle. It is not completely blocking, since running it this way still saves a lot of time compared to a single core, but many of the cores stay idle (more than half of them at <20% usage) even though I’m running the script on all 16 cores. This is especially noticeable during a long call, when only one or two cores are actually active. The total speedup is also nowhere near 16x.

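from ray.util import ActorPool
from tqdm import tqdm

# actors: list of actor handles that expose a remote `process` method
pool = ActorPool(actors)
# map() returns a lazy iterator that yields results in the order of `args`
poolmap = pool.map(
    lambda a, v: a.process.remote(v),  # a: actor from the pool, v: one item of args
    args,
)
result_list = list(tqdm(poolmap, total=length))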

I suspect the way I collect the results (last line) is not optimal, but I’m not sure how to make it better. Could you help me improve it?

Just want to bump this, I’m really looking for help here :sweat_smile:

Hmm, looking through the ray.util.actor_pool source (Ray 2.0.1), it seems like the cores should be evenly used. Perhaps there is a bug somewhere. I’ll try to repro soon to better understand what’s going on.

@cade Sounds good, thank you so much!
Just wanted to make sure: am I using the ActorPool class correctly, especially the way I get the results?
And are there alternative ways to parallelize this with Ray that I should try?

Hi @cade, how is it going? Just wanted to check back with you in case there’s any update.
Please let me know if you need more information to reproduce the bug.
On the other hand, I would really appreciate it if you could point me to an alternative way of doing this, since I know the actor_pool class is about to be deprecated.

Thanks again!