I am getting the following errors (raylet.err):
[2025-10-14 15:40:35,254 E 185 254] (raylet) agent_manager.cc:87: The raylet exited immediately because one Ray agent failed, agent_name = runtime_env_agent.
The raylet fate shares with the agent. This can happen because
- The version of `grpcio` doesn’t follow Ray’s requirement. Agent can segfault with the incorrect `grpcio` version. Check the grpcio version `pip freeze | grep grpcio`.
- The agent failed to start because of unexpected error or port conflict. Read the log `cat /tmp/ray/session_latest/logs/{dashboard_agent|runtime_env_agent}.log`. You can find the log file structure in the Ray documentation (Configuring Logging).
- The agent is killed by the OS (e.g., out of memory).
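The two checks suggested in the message above (the installed `grpcio` version and the agent logs) can be collected in one go. This is only a minimal sketch: `installed_versions` and `tail_agent_logs` are hypothetical helper names, not part of Ray, and the default log directory is the one named in the message. On this setup the logs are under `/blue_data/data/ray/session_latest/logs`, so that path would need to be passed instead.

```python
from collections import deque
from importlib import metadata
from pathlib import Path


def installed_versions(packages=("ray", "grpcio")):
    """Return {package: version or None} for the current Python environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return versions


def tail_agent_logs(log_dir="/tmp/ray/session_latest/logs", lines=50):
    """Return the last `lines` lines of each agent log that exists in `log_dir`."""
    tails = {}
    for name in ("dashboard_agent.log", "runtime_env_agent.log"):
        path = Path(log_dir) / name
        if path.is_file():
            with path.open(errors="replace") as fh:
                # deque with maxlen keeps only the final `lines` lines
                tails[name] = list(deque(fh, maxlen=lines))
    return tails


if __name__ == "__main__":
    print(installed_versions())
    for name, tail in tail_agent_logs().items():
        print(f"===== {name} =====")
        print("".join(tail), end="")
```

Running it inside the same container and Python environment as the raylet matters, since a different environment may report different package versions.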
ray_client_server.err:
(base) ray@blue_server_ray:/blue_data/data/ray/session_latest/logs$ cat ray_client_server.err
2025-10-14 15:39:26,767 INFO server.py:901 -- Starting Ray Client server on 172.18.0.4:10001, args Namespace(host='172.18.0.4', port=10001, mode='proxy', address='172.18.0.4:6380', redis_username=None, redis_password=None, runtime_env_agent_address='http://172.18.0.4:50278')
2025-10-14 15:40:33,975 INFO proxier.py:697 -- New data connection from client 936507d607a24baba9c084f0cf1534c0:
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1760481634.007812 352 fork_posix.cc:71] Other threads are currently calling into gRPC, skipping fork() handlers
2025-10-14 15:40:34,017 INFO proxier.py:342 -- SpecificServer started on port: 23000 with PID: 359 for client: 936507d607a24baba9c084f0cf1534c0
2025-10-14 15:40:35,603 ERROR proxier.py:825 -- Proxying Logstream failed!
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.10/site-packages/ray/util/client/server/proxier.py", line 822, in Logstream
    for resp in resp_stream:
  File "/home/ray/anaconda3/lib/python3.10/site-packages/grpc/_channel.py", line 543, in __next__
    return self._next()
  File "/home/ray/anaconda3/lib/python3.10/site-packages/grpc/_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "Socket closed"
    debug_error_string = "UNKNOWN:Error received from peer ipv4:127.0.0.1:23000 {grpc_message:"Socket closed", grpc_status:14}"
>
2025-10-14 15:40:35,604 ERROR proxier.py:750 -- Proxying Datapath failed!
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.10/site-packages/ray/util/client/server/proxier.py", line 743, in Datapath
    for resp in resp_stream:
  File "/home/ray/anaconda3/lib/python3.10/site-packages/grpc/_channel.py", line 543, in __next__
    return self._next()
  File "/home/ray/anaconda3/lib/python3.10/site-packages/grpc/_channel.py", line 969, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "Socket closed"
    debug_error_string = "UNKNOWN:Error received from peer {grpc_status:14, grpc_message:"Socket closed"}"
>
2025-10-14 15:40:56,928 INFO proxier.py:392 -- Specific server 936507d607a24baba9c084f0cf1534c0 is no longer running, freeing its port 23000
2025-10-14 15:41:05,611 INFO proxier.py:769 -- 936507d607a24baba9c084f0cf1534c0 last started stream at 1760481633.9639764. Current stream started at 1760481633.963976
1. Severity of the issue: (select one)
- [ ] None: I’m just curious or want clarification.
- [ ] Low: Annoying but doesn’t hinder my work.
- [ ] Medium: Significantly affects my productivity but can find a workaround.
- [x] High: Completely blocks me.
2. Environment:
- Ray version: 2.46.0
- Python version: 3.10.12
- OS: 15.0.1 (24A348)
- Cloud/Infrastructure:
- Other libs/tools (if relevant):

Package Version
---------------------------------- ---------
adlfs 2023.8.0
aiohappyeyeballs 2.6.1
aiohttp 3.11.16
aiohttp-cors 0.7.0
aiosignal 1.3.1
amqp 5.3.1
annotated-types 0.6.0
anyio 3.7.1
archspec 0.2.5
async-timeout 4.0.3
attrs 25.1.0
azure-common 1.1.28
azure-core 1.29.5
azure-datalake-store 0.0.53
azure-identity 1.17.1
azure-storage-blob 12.22.0
billiard 4.2.1
boltons 24.0.0
boto3 1.29.7
botocore 1.32.7
Brotli 1.1.0
cachetools 5.5.2
celery 5.5.3
certifi 2025.1.31
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
click-didyoumean 0.3.1
click-plugins 1.1.1.2
click-repl 0.3.0
cloudpickle 2.2.0
colorama 0.4.6
colorful 0.5.5
conda 25.9.0
conda-libmamba-solver 25.4.0
conda-package-handling 2.4.0
conda_package_streaming 0.11.0
cryptography 44.0.3
cupy-cuda12x 13.1.0
Cython 0.29.37
distlib 0.3.7
distro 1.9.0
dm-tree 0.1.8
exceptiongroup 1.3.0
Farama-Notifications 0.0.4
fastapi 0.115.12
fastrlock 0.8.2
filelock 3.17.0
flatbuffers 23.5.26
frozendict 2.4.6
frozenlist 1.4.1
fsspec 2023.12.1
google-api-core 2.24.2
google-api-python-client 2.111.0
google-auth 2.23.4
google-auth-httplib2 0.1.1
google-cloud-core 2.4.1
google-cloud-storage 2.14.0
google-crc32c 1.5.0
google-oauth 1.0.1
google-resumable-media 2.6.0
googleapis-common-protos 1.61.0
grpcio 1.74.0
gymnasium 1.1.1
h11 0.16.0
h2 4.1.0
hpack 4.0.0
httplib2 0.20.4
httptools 0.6.4
hyperframe 6.0.1
idna 3.7
importlib-metadata 6.11.0
isodate 0.6.1
Jinja2 3.1.6
jmespath 1.0.1
jsonpatch 1.33
jsonpointer 3.0.0
jsonschema 4.23.0
jsonschema-specifications 2024.10.1
kombu 5.5.4
libmambapy 2.3.2
lz4 4.3.3
markdown-it-py 2.2.0
MarkupSafe 2.1.3
mdurl 0.1.2
memray 1.10.0
menuinst 2.3.1
msal 1.28.1
msal-extensions 1.2.0b1
msgpack 1.0.7
multidict 6.0.5
numpy 1.26.4
opencensus 0.11.4
opencensus-context 0.1.3
opentelemetry-api 1.34.1
opentelemetry-exporter-prometheus 0.55b1
opentelemetry-proto 1.27.0
opentelemetry-sdk 1.34.1
opentelemetry-semantic-conventions 0.55b1
ormsgpack 1.7.0
packaging 23.0
pandas 1.5.3
pip 25.2
platformdirs 3.11.0
pluggy 1.5.0
portalocker 2.8.2
prometheus-client 0.19.0
prompt-toolkit 3.0.41
propcache 0.3.0
proto-plus 1.22.3
protobuf 4.25.8
psutil 5.9.6
py-spy 0.4.0
pyarrow 19.0.1
pyasn1 0.5.1
pyasn1-modules 0.3.0
pycosat 0.6.6
pycparser 2.21
pydantic 2.11.7
pydantic_core 2.33.2
Pygments 2.18.0
PyJWT 2.8.0
pyOpenSSL 25.0.0
pyparsing 3.1.1
PySocks 1.7.1
python-dateutil 2.8.2
python-dotenv 1.1.1
pytz 2022.7.1
PyYAML 6.0.1
ray 2.50.0
referencing 0.36.2
requests 2.32.3
rich 13.3.2
rpds-py 0.22.3
rsa 4.7.2
ruamel.yaml 0.18.15
ruamel.yaml.clib 0.2.12
s3transfer 0.8.0
scipy 1.11.4
setuptools 80.9.0
six 1.16.0
smart-open 6.2.0
sniffio 1.3.1
starlette 0.46.2
tensorboardX 2.6.2.2
tqdm 4.67.1
truststore 0.10.0
typing_extensions 4.12.2
typing-inspection 0.4.1
tzdata 2025.2
uritemplate 4.1.1
urllib3 1.26.19
uvicorn 0.22.0
uvloop 0.21.0
vine 5.1.0
virtualenv 20.29.1
watchfiles 0.19.0
wcwidth 0.2.13
websockets 11.0.3
wheel 0.45.1
yarl 1.18.3
zipp 3.19.2
zstandard 0.25.0
3. What happened vs. what you expected:
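- Expected: `ray.init` initializes successfully.
- Actual: `ray.init` fails; the raylet exits immediately because `runtime_env_agent` failed (see the logs above).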