Ray spawns too many actors

Hello,
I was wondering if anyone could give any insight into this problem with Ray 2.10. (I can’t install later versions because I am limited to Python 3.8.)

I am trying to create a pipeline of Actors to process a list of items. The list order is unimportant, but the output of the preceding actor is fed into the next.
I have tried this by creating an ActorPool for each stage in the pipeline and generating results using map_unordered. Unfortunately, I get errors saying that a large number of processes have been created, and the message refers me to issue 3644 for workarounds.
I’ve tried restructuring my code to bypass map_unordered: essentially, calling ActorPoolN+1.submit() for each item returned by ActorPoolN, after first checking that an item exists and calling get_next_unordered(). Yet the problem still persists.

Each Actor necessarily calls get() on some elements of the item it receives, but to my mind those elements should already exist, because the item is the result of the previous Actor successfully completing its processing, by virtue of having been returned from get_next_unordered().

Ideally I’d like to stop Ray from spawning any new Actors, and I am a little surprised that it tries to when I explicitly create a specific number in the ActorPool(). Is the problem that Ray allocates only one actor to each processor, and I’ve created so many actors across my pipeline that the actors at the end of the pipeline aren’t executing to clear the list?

Any hints would be greatly appreciated.

Kind regards,
Joihn

Oof - the limitation to Python 3.8 is going to be a challenge… Any timeline on your side to upgrade? We’ve actually been encouraged by the community to move to higher Python versions ASAP, so it’s going to be hard to justify a fix for this even if it does end up being a bug.

That aside, can you share a repro script? If I have time this week I’ll try to set up a new conda env and see if I can reproduce…

TPM @ Anyscale here