Issues with setting up runtime environment using "uv run"

I’m experimenting with running my application job using uv, but it’s not working as expected.
When I submit the job with uv python pin 3.9 && uv run -m myapp.mytask along with the runtime configuration below, the job stays in the PENDING state, and no logs are available to indicate what is happening internally.

runtime_env = {
    "env_vars": {},
    "py_executable": "uv python pin 3.9 && uv run",  # also tried with just "uv run"
    "working_dir": "../../nile-intelligence",
    "excludes": [".venv"],
}

I tested the setup without using the "py_executable" configuration, and it works without any issues. However, since I need to run tasks across nodes, I have to configure the same runtime environment on all nodes. When I use "py_executable", the job hangs, and there are no clear indications of where it gets stuck.

I’m using the Ray 2.46 image for the cluster. Are there any known issues with this version aside from the inability to stop pending jobs?

Any guidance or suggestions would be appreciated.

Yes, there are known issues with "py_executable": "uv run" in Ray runtime environments, especially in Ray 2.46: jobs can hang in the PENDING state with no logs, and stopping such jobs is also problematic. This is a common pain point reported by users, and improvements/fixes have landed in later Ray versions (see issue). Additionally, you should not set py_executable to include shell logic like uv python pin 3.9 && uv run; Ray expects a single executable, not a shell command chain (see docs). The recommended approach is to use uv run directly as the entrypoint and to avoid setting py_executable unless you have a very specific advanced use case.

For best results, upgrade to Ray ≥2.47 and use the standard pattern:

ray job submit --working-dir . -- uv run -m myapp.mytask

This ensures both driver and workers use the uv environment as specified in your pyproject.toml (docs). If you must use py_executable, only set it to "uv run" (not a shell chain), and ensure your cluster image has uv installed and the correct Python version.
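If you must go the py_executable route, here is a minimal sketch of a programmatic submission, assuming a cluster dashboard reachable at http://127.0.0.1:8265 (a placeholder address) and the myapp.mytask module from your post:

from ray.job_submission import JobSubmissionClient

# Placeholder dashboard address; point this at your cluster.
client = JobSubmissionClient("http://127.0.0.1:8265")

client.submit_job(
    entrypoint="uv run -m myapp.mytask",
    runtime_env={
        # A single executable string, not a shell chain like
        # "uv python pin 3.9 && uv run".
        "py_executable": "uv run",
        "working_dir": ".",
        "excludes": [".venv"],
    },
)

With py_executable set to "uv run", Ray starts worker processes through uv, so the dependencies declared in your pyproject.toml govern the worker environment as well as the driver.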

Would you like a step-by-step breakdown or troubleshooting checklist?

@RunLLM Thanks. I’ve switched to ray==2.52.0. With this, I’m seeing the error below on the worker processes. I’m not sure why this import error ((raylet) ModuleNotFoundError: No module named 'ray') occurs; it happens only when I include "py_executable": "uv run". Is this expected? Can you please advise on this error?

0%| | 0/2 [00:00<?, ?it/s]
(raylet, ip=10.6.121.149) Using CPython 3.10.19 interpreter at: /home/ray/anaconda3/bin/python3
(raylet, ip=10.6.121.149) Creating virtual environment at: .venv
(raylet, ip=10.6.121.149) warning: Failed to hardlink files; falling back to full copy. This may lead to degraded performance.
(raylet, ip=10.6.121.149) If the cache and target directories are on different filesystems, hardlinking may not be supported.
(raylet, ip=10.6.121.149) If this is intentional, set export UV_LINK_MODE=copy or use --link-mode=copy to suppress this warning.
(raylet) There are tasks with infeasible resource requests that cannot be scheduled. See Scheduling — Ray 2.52.0 for more details. Possible solutions: 1. Updating the ray cluster to include nodes with all required resources 2. To cause the tasks with infeasible requests to raise an error instead of hanging, set the 'RAY_enable_infeasible_task_early_exit=true'. This feature will be turned on by default in a future release of Ray.
(autoscaler +40s) Removing 1 nodes of type workergroup (max number of worker nodes reached).
(autoscaler +40s) Resized to 6 CPUs.
(raylet) There are tasks with infeasible resource requests that cannot be scheduled. See Scheduling — Ray 2.52.0 for more details. Possible solutions: 1. Updating the ray cluster to include nodes with all required resources 2. To cause the tasks with infeasible requests to raise an error instead of hanging, set the 'RAY_enable_infeasible_task_early_exit=true'. This feature will be turned on by default in a future release of Ray.
(autoscaler +45s) Removing 1 nodes of type workergroup (max number of worker nodes reached).
(autoscaler +45s) Resized to 6 CPUs.
(raylet) [2025-11-24 04:40:34,327 E 473 473] (raylet) worker_pool.cc:590: Some workers of the worker process(50830) have not registered within the timeout. The process is dead, probably it crashed during start.
(raylet) There are tasks with infeasible resource requests that cannot be scheduled. See Scheduling — Ray 2.52.0 for more details. Possible solutions: 1. Updating the ray cluster to include nodes with all required resources 2. To cause the tasks with infeasible requests to raise an error instead of hanging, set the 'RAY_enable_infeasible_task_early_exit=true'. This feature will be turned on by default in a future release of Ray.
(autoscaler +50s) Removing 1 nodes of type workergroup (max number of worker nodes reached).
(autoscaler +50s) Resized to 6 CPUs.
(raylet) Traceback (most recent call last):
(raylet)   File "/home/ray/anaconda3/lib/python3.10/site-packages/ray/_private/workers/default_worker.py", line 8, in <module>
(raylet)     import ray
(raylet) ModuleNotFoundError: No module named 'ray'

The error (raylet) ModuleNotFoundError: No module named 'ray' on worker startup means the Ray package is not installed in the Python environment that the worker process is using. This is a common issue when Ray creates a new virtual environment (e.g., .venv) for the runtime environment, but the Ray package itself is not present in that environment (see discussion, see also).

To fix this, ensure that:

  • The Ray package is included in your runtime environment dependencies (e.g., in your pyproject.toml, requirements.txt, or runtime_env pip/uv list).
  • If using uv, make sure ray is listed as a dependency so it is installed in the worker’s environment.
  • Avoid situations where Ray creates a new .venv that does not inherit the parent environment’s packages.

Example fix (for runtime_env):

runtime_env = {
    "pip": ["ray==2.52.0", ...],  # add other dependencies as needed
    ...
}

Or, if using uv, ensure ray is in your pyproject.toml dependencies.
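For illustration, a minimal pyproject.toml along these lines (the project name is a placeholder; adjust the Python requirement to match your cluster image) makes uv install Ray into the virtual environment it creates for the workers:

[project]
name = "myapp"                # placeholder project name
version = "0.1.0"
requires-python = ">=3.10"    # the cluster image runs CPython 3.10
dependencies = [
    "ray[default]==2.52.0",   # pin to the cluster's Ray version
]

Keeping the pinned ray version identical to the cluster image avoids version-mismatch failures between the driver, the workers, and the raylets.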

Would you like more detailed troubleshooting steps?
