Documentation around runtime environments

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

If I specify pip packages in the runtime_env, when and where does the environment get built?

I want to preinstall some larger packages and then let users bring their own smaller ones. This doesn’t seem to work: importing the preinstalled packages fails with a module-not-found error.

On the documentation page I see these two passages, which seem to contradict each other:

Runtime environments can be used on top of the prepared cluster environment from the first approach. Runtime environments also allow you to set dependencies per-task, per-actor, and per-job on a long-running Ray cluster.

However, using runtime environments you can dynamically specify packages to be automatically downloaded and installed in an isolated virtual environment for your Ray job, or for specific Ray tasks or actors.

from Environment Dependencies — Ray 2.8.0

Unless I made a mistake, the second passage seems to reflect reality. The first passage would be a nice feature if it could be implemented, though, and having the choice would be good.

The downside of isolated environments is that I now have to carry heavy dependencies around with the runtime env. I have seen hangs of more than 30 minutes when deploying a runtime environment with pytorch + dagster, which are both fairly heavy.

If I could specify resources for building the environment, that might be a happy medium.

related slack thread

@shrekris question about runtime_envs

Hi @Nintorac, thanks for pointing out the doc inconsistency. Indeed, the environment is not isolated, unless you use the "conda" field. I’ll make a note to remove this from the docs.

In particular, you can preinstall your “heavy” dependencies on the cluster before starting Ray, either by using setup_commands in the Ray cluster.yaml (Cluster YAML Configuration Options — Ray 1.13.0), or by making them part of your docker image.

Later, when anyone uses the "pip" field of runtime_env, those packages will be installed at runtime, but the existing “heavy” dependencies will still be importable.
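A minimal sketch of how one might check this behaviour, not a definitive recipe. The package names are placeholders: "torch" stands in for a heavy dependency preinstalled on the cluster image, and "emoji" for a small package a user adds through the "pip" field. The helper names `importable` and `run_on_cluster` are made up for this example.

```python
import importlib.util

# Placeholder names (assumptions, not taken from the thread's actual setup):
# "torch" = heavy package preinstalled on the cluster image,
# "emoji" = small package installed at runtime via the "pip" field.
PREINSTALLED = "torch"
RUNTIME_ENV = {"pip": ["emoji"]}


def importable(module_names):
    """Return {name: bool} saying whether each top-level module can be found."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


def run_on_cluster():
    """Run the import check inside a Ray task that uses RUNTIME_ENV."""
    import ray  # imported lazily so the helper above works without Ray

    ray.init(runtime_env=RUNTIME_ENV)

    @ray.remote
    def check():
        # Executes on a worker; reports which modules that worker can see.
        return importable([PREINSTALLED, "emoji"])

    return ray.get(check.remote())
```

Calling `run_on_cluster()` from a driver attached to the cluster returns a dict with one entry per module. If the preinstalled package shows up as `False`, that would be a useful detail to include with the ModuleNotFound report.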

Can you give more details about the ModuleNotFound error you saw?

OK great, thanks for the clarification around the docs.

I was moving quickly, so it was likely user error. I tried to reproduce it just now, but I am getting errors from elsewhere and I’m about to go on break, so it’s a bit of a can of worms that I’ll leave for when I get back. I’ll try to remember to update here when I get around to it.