What is the expected startup time of worker processes?

Hello everyone, we are using Ray in maybe a bit of an unconventional way. We have calculations that are a bit too big to run on a single node, so we distribute them over multiple nodes using Ray. Because the software needs some environment set up around it (files on the right node), we use a placement group for that, and the application always uses the same number of workers as there are worker nodes. We run the Ray components inside containers on K8s and start them via the CLI; we do not use the operator.
Now the caveat: the calculations are used interactively, so we are quite sensitive to the latency of the setup.
During testing I found that if I call the endpoint again shortly after a previous call, Ray does not add any noticeable overhead. But when I wait a bit longer (around 2 seconds), the next request takes between 2.5 and 3.1 seconds to execute.
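For reference, the latencies above can be measured with a minimal stdlib-only harness like the sketch below; nothing here is Ray-specific, and `fn` just stands in for whatever wraps the actual remote call:

```python
import time

def time_call(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

def probe_latency(fn, idle_gap_s, repeats=3):
    """Call fn repeatedly with a fixed idle gap between calls, collecting
    per-call latencies so warm and cold paths can be compared.
    With idle_gap_s below the worker idle timeout the calls stay warm;
    above it, each call should pay the worker startup cost again."""
    latencies = []
    for _ in range(repeats):
        _, elapsed = time_call(fn)
        latencies.append(elapsed)
        time.sleep(idle_gap_s)
    return latencies
```

Running it once with a short gap and once with a gap above ~2 seconds is how I get the numbers quoted above.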
I have already pinned it down to the way Ray handles worker leases.
When I run into the rather short idle timeout, the worker process is killed by the raylet, so a subsequent request has to start a new worker process. I already set RAY_enable_worker_prestart to 1 and RAY_worker_lease_timeout_milliseconds to 50000, but neither had any effect on the timings.
So my actual question is: what is the expected time until a calculation is executed on a worker, when there is no lease and the worker and head are on different nodes?
I am trying to get a feeling for how long that process normally takes and how to tune those values.
From the architecture document I gathered that there is quite a bit of back and forth between the raylet and the head, so the time does not look completely unreasonable, but 3 seconds still seems slow for such a startup in my view.
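For completeness, this is roughly how I apply the overrides. The RAY_* variables have to be in the raylet's environment before `ray start` runs; exporting them afterwards has no effect on an already-running raylet. The idle-worker threshold name is my assumption based on Ray's internal config defaults and may differ between versions, so please verify it against your Ray release:

```python
import os

# Overrides for the raylet process. The first two are the flags I tried;
# the third is an assumed name for the idle-worker kill threshold
# (from Ray's config defaults -- verify it exists in your Ray version).
RAY_TUNING = {
    "RAY_enable_worker_prestart": "1",
    "RAY_worker_lease_timeout_milliseconds": "50000",
    "RAY_idle_worker_killing_time_threshold_ms": "600000",  # assumed name
}

def ray_start_env():
    """Environment for launching `ray start` with the overrides applied."""
    env = dict(os.environ)
    env.update(RAY_TUNING)
    return env

# e.g. subprocess.run(["ray", "start", "--head"], env=ray_start_env())
```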

Are you using a static K8s cluster?

We are managing the worker nodes and the head node ourselves, so I think it counts as static.
It does not scale automatically to new nodes.