Keep long-running workers to handle slow initialization

Hello, whenever we start a Ray worker, we need to run some initialization code which takes around 5 seconds.

We were wondering if there’s a way to always have a few IDLE Ray workers that are already initialized, so that we don’t have to pay the initialization cost. Currently, there doesn’t seem to be a way to control how long idle workers stick around. We also couldn’t find any other workaround to keep this initialization cost from slowing down our whole computation.
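To make the pattern concrete, here is a minimal sketch of what we are doing today (`expensive_init`, the 5-second sleep, and `process` are placeholders for our real code): any task that lands on a freshly started worker pays the initialization cost again.

```python
import time
import ray

ray.init()

_initialized = False

def expensive_init():
    # Stand-in for our real setup code, which takes roughly 5 seconds.
    time.sleep(5)

@ray.remote
def process(item):
    # Lazy per-worker initialization: the flag lives in the worker process,
    # so the cost is paid once per worker, not once per task.
    global _initialized
    if not _initialized:
        expensive_init()
        _initialized = True
    return item * 2

# Whenever the autoscaler reclaims idle workers and new ones are started,
# the first task on each fresh worker pays the ~5 s cost again.
print(ray.get([process.remote(i) for i in range(4)]))
```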

Any advice will be greatly appreciated.

Hello, maybe you can try setting minWorkers in the values.yaml file to the number of workers you always need in the cluster. Setting a large idleTimeoutMinutes can also help keep idle workers alive in an autoscaled cluster.
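For example, something along these lines in the Helm chart’s values.yaml should keep a fixed pool of workers around. This is only a rough sketch; the exact key names and layout depend on the chart/operator version you are running, so check your chart’s defaults.

```yaml
# Hypothetical excerpt of values.yaml for the Ray Helm chart; key names may
# differ between chart versions.
podTypes:
  rayWorkerType:
    minWorkers: 4        # always keep at least 4 worker pods in the cluster
    maxWorkers: 10

# Keep idle workers (beyond minWorkers) alive longer before the autoscaler
# scales them down.
idleTimeoutMinutes: 60
```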

Thank you for your response. That could technically work; the only thing missing would be a way to run an initialization function on workers at startup. run_function_on_all_workers doesn’t seem to work.

Maybe try using actors and put the initialization logic in the actor class’s __init__ method.
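If actors work for your use case, a minimal sketch would look like the following (the Worker class, its process method, and the pool size are made up for illustration):

```python
import time
import ray

ray.init()

@ray.remote
class Worker:
    def __init__(self):
        # The expensive setup runs once, when the actor process starts.
        time.sleep(5)  # stand-in for the real ~5 s initialization

    def process(self, item):
        # Later calls reuse the already-initialized actor state.
        return item * 2

# Create the pool of pre-initialized actors up front, then reuse them.
workers = [Worker.remote() for _ in range(4)]
results = ray.get([w.process.remote(i) for i, w in enumerate(workers)])
print(results)
```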

Thanks again for your response. We considered that; the only problem with using actors is that we are planning to use Modin on Ray (see the Modin documentation, “Scale your pandas workflow by changing a single line of code”). Using actors might mean that we cannot plug Modin in directly and might need to fork it and have it use our custom backend. Please let us know if that understanding is incorrect.
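For reference, the usual Modin-on-Ray setup is just the import swap below (assuming modin[ray] is installed); since Modin dispatches DataFrame operations to Ray itself, we don’t see an obvious place to route that work through our own pre-initialized actors without forking it:

```python
import ray
ray.init()

import modin.pandas as pd  # the advertised "single line of code" change from plain pandas

# Modin schedules DataFrame operations on Ray behind the scenes, so there is
# no hook for plugging in our own actor pool.
df = pd.DataFrame({"a": range(10), "b": range(10)})
print(df.sum())
```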

I believe this is currently not supported. But there is a workaround (not recommended, since it doesn’t use the public APIs). You can call certain functions on all workers when they start by calling ray.worker.global_worker.run_function_on_all_workers(function). Note that API stability is not guaranteed for this approach.
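A rough sketch of that workaround is below. The init_worker name is made up, and the callback’s signature is an internal detail (on the Ray versions I’ve seen, it receives a worker_info argument), so expect this to vary between releases:

```python
import time
import ray

ray.init()

def init_worker(worker_info):
    # worker_info is whatever Ray's internal machinery passes to the hook;
    # its contents are an implementation detail of the private API.
    time.sleep(5)  # stand-in for the real ~5 s initialization

# Private, unstable API: it may move or change signature between Ray releases.
ray.worker.global_worker.run_function_on_all_workers(init_worker)
```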