I’ve been trying to adapt my service to use Ray Workflow instead, but I’m running into some issues: I get OOMs because memory is not released by Ray::IDLE processes (I posted the issue here). Besides that, each invocation of the workflow still has to load the model again from scratch, which is quite wasteful, since I always need the same model.
So, I’m still interested in my original questions from the first post:
- Is there a risk of “losing” work by calling `deployment.remote()` but not `await`ing its result, for example?
- How large is the internal queue that receives requests when calling `deployment.remote()`? Is there a risk of it dropping requests?