1. Severity of the issue:
Medium: Significantly affects my productivity but can find a workaround.
2. Environment:
Ray version: 2.47
Python version: 3.12.11
OS: Rocky Linux 8
Other libs/tools (if relevant): uv 0.9.7
3. What happened vs. what you expected:
Expected:
import ray
ray.init()
runs without errors
Actual:
import ray
ray.init()
Raises psutil.AccessDenied due to psutil.Process().parents() here. If I handle this error gracefully with a patch, everything works as expected.
Please note that a similar problem arises again in Ray 2.51.0 due to psutil.Process(dashboard_pid).children() here. Again, if I handle this error gracefully with a patch, everything works as expected.
I do not understand the intricacies of psutil, especially in the context of Ray, but I see that psutil.AccessDenied is already handled in many parts of the codebase. Catching psutil.AccessDenied at these two call sites is simple and looks harmless, so the fix should be easy. However, I am hesitant to submit a fix on GitHub because this issue arises only on an NVIDIA DGX server running Rocky Linux 8, on which I do not have admin privileges. I am unable to reproduce it anywhere else.
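For illustration, the kind of graceful handling described above might look like the sketch below. The helper names `safe_parents` and `safe_children` are hypothetical and are not taken from the Ray codebase. Returning an empty list on failure is an assumption about what the calling code can tolerate, not a statement about how Ray behaves.

```python
import psutil


def safe_parents(proc):
    """Return proc.parents(), or [] if the process tree cannot be inspected.

    Hypothetical helper: treating "access denied" as "no known parents"
    is an assumption about what the caller can tolerate.
    """
    try:
        return proc.parents()
    except (psutil.AccessDenied, psutil.NoSuchProcess):
        return []


def safe_children(proc, recursive=False):
    """Return proc.children(...), or [] if access is denied or proc is gone."""
    try:
        return proc.children(recursive=recursive)
    except (psutil.AccessDenied, psutil.NoSuchProcess):
        return []
```

A caller would then write something like `safe_parents(psutil.Process())` instead of calling `parents()` directly. Whether an empty list is a safe fallback depends on what each call site does with the result, which is worth checking before proposing a patch.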
It seems likely that your issue is due to psutil.AccessDenied exceptions when calling psutil.Process().parents() or psutil.Process().children() in Ray, which only occurs on your Nvidia DGX server with Rocky Linux 8 and not elsewhere. The Ray codebase already handles psutil.AccessDenied in several places, and your workaround (catching the exception) appears to be safe and effective, as you observed no negative side effects after patching locally. However, the specific lines you mention (parents/children) are not universally wrapped in try/except for AccessDenied, which could explain your experience.
My understanding is that this is a system-level permission issue, and Ray’s maintainers have previously discussed similar problems, especially on systems where users lack certain privileges. Since your patch works and is consistent with existing Ray error handling patterns, it seems like a reasonable fix, but the maintainers may want to know more about your environment before merging such a change. Would you like more detail on where and how psutil.AccessDenied is handled in the Ray codebase, or guidance on submitting a minimal, safe PR?
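To give maintainers more detail about the environment, one option is a small probe that runs independently of Ray and reports which psutil process-tree queries are denied. This is only a sketch: `probe_psutil_access` is a hypothetical name, and it checks only the two calls mentioned in this thread.

```python
import psutil


def probe_psutil_access(proc):
    """Report whether parents() and children() succeed for the given process.

    Hypothetical diagnostic helper; it only exercises the two psutil calls
    discussed in this thread and records the outcome of each.
    """
    results = {}
    for name in ("parents", "children"):
        try:
            getattr(proc, name)()
            results[name] = "ok"
        except psutil.AccessDenied as exc:
            results[name] = f"AccessDenied: {exc}"
        except psutil.NoSuchProcess as exc:
            results[name] = f"NoSuchProcess: {exc}"
    return results


if __name__ == "__main__":
    # psutil.Process() with no argument refers to the current process.
    print(probe_psutil_access(psutil.Process()))
```

Running this on the affected server and on a machine where Ray starts normally, then including both outputs in the report, would show whether the failure can be reproduced without Ray.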