Ray Python parallel processing of a deep learning model across multiple Docker containers

I have 3 Docker containers, each containing a deep learning model (TensorFlow).
Each container runs a batch-inference job over, say, 1000 images.
I have created a batch-inference actor, and it parallelizes the batches across multiple CPUs.

Which of the following scenarios is best suited for this task?

  1. Each Docker container runs its own separate Ray cluster (started from the Python code inside the container).

  2. A Ray cluster is configured locally on the host, and each Docker container connects to it.

  3. The Ray cluster runs in a separate Docker container, and the three containers connect to it.

  4. One of the three Docker containers runs Ray, and all three containers connect to it.


I think the most common pattern is 4. Take a look at this doc: Launching an On-Premise Cluster — Ray 3.0.0.dev0. When you deploy via Docker containers, make sure all the necessary ports are open: Configuring Ray — Ray 2.0.0
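For pattern 4, the containers that don't run the head would connect as Ray clients. A minimal sketch, assuming the head container is reachable under the hostname `ray-head` (a placeholder) and uses Ray's default client server port 10001:

```python
import ray

# The head container would have been started with something like
# `ray start --head`, with ports 6379 (GCS) and 10001 (client server)
# published. "ray-head" is an assumed container hostname; adjust to yours.
ray.init(address="ray://ray-head:10001")
```

Once connected, the same `@ray.remote` actors and tasks run on the shared cluster instead of a local one.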


Thanks, this helps!

I just realized that the dependencies of these three Docker containers would differ, e.g. TensorFlow, PyTorch, and something else.

On top of that, what if a 4th container with yet other dependencies is added as well?

I believe the Ray cluster would need those dependencies too, right?
Is there a standard way to handle this? :slight_smile:

It is recommended to have the same dependencies on all head and worker nodes. Alternatively, you can use runtime environments to sync dependencies: Environment Dependencies — Ray 3.0.0.dev0