I recently started using Ray's `ActorPool` to parallelize my Python code on my local machine (using the code below), and it's definitely working. Specifically, I used it to process a list of arguments and return a list of results. (Note that depending on the input, the `process` function can take very different amounts of time.)
However, while testing the script, the processes seem to be partially "blocking" each other: whenever one task takes a long time, the other cores stay more or less idle. It's definitely not a complete block, since running it this way is still much faster than running on a single core, but even though I'm running the script on all 16 cores, more than half of them sit at under 20% usage. This is especially noticeable while a long task is running, when only one or two cores are actually active. The total time saved is also nowhere near 16x.
```python
pool = ActorPool(actors)
poolmap = pool.map(
    lambda a, v: a.process.remote(v),
    args,
)
result_list = [r for r in tqdm(poolmap, total=length)]
```
I suspect this is because the way I collect the result values (last line) is not optimal, but I'm not sure how to make it better. Could you guys help me improve it?