I get this error when I try to connect a worker node to the head node. The runtime_env_agent port is open and accessible from the head.
runtime_env_agent_client.cc:293: The raylet exited immediately because the runtime env agent timed out when Raylet try to connect to it. This can happen because the runtime env agent was never started, or is listening to the wrong port. Read the log cat /tmp/ray/session_latest/logs/runtime_env_agent.log. You can find the log file structure here Configuring Logging - Ray 3.0.0.dev0.
Hello! This error usually means the Raylet can't connect to the runtime env agent, typically because of a port issue or because the agent didn't start correctly. The error message suggests checking /tmp/ray/session_latest/logs/runtime_env_agent.log for any errors. Can you check it and let me know what it says?
Thanks!
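To rule out the port side quickly, here is a minimal sketch of a TCP reachability check in Python. It only tests whether something accepts connections on a given host and port; it does not confirm that the listener is actually the runtime env agent. The host and port values in the usage comment are placeholders (not taken from this thread), and running it on the worker node itself, against the address that node advertises, is my assumption about where the check is most useful.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable or unresolvable hosts.
        return False


# Hypothetical usage: substitute the worker's advertised IP and the
# runtime env agent port you configured.
# print(port_is_open("<worker-node-ip>", <runtime-env-agent-port>))
```

If this returns False when run on the worker against its own advertised IP, the agent is probably not listening on that address, which matches the timeout in the error.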
@christina the issue seems to be the autoscaler removing the worker because it is not one of the workers it created:
2025-06-26 23:11:00,547 INFO load_metrics.py:159 -- LoadMetrics: Removed 1 stale ip mappings: {'202.123.453.123'} not in {'10.0.3.4', '10.0.3.5'}
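For anyone hitting the same thing, here is a small sketch that pulls the two IP sets out of a log line like the one above, so you can see which address was dropped and which addresses the autoscaler knows about. The regex is based only on the single line quoted in this post, so treat the format as an assumption; `parse_stale_ips` is a helper written for this post and not part of Ray.

```python
import re

# Pattern inferred from the one log line quoted above (an assumption, not a documented format).
STALE_RE = re.compile(r"stale ip mappings: \{(.*?)\} not in \{(.*?)\}")


def _to_set(body: str) -> set:
    """Split "'a', 'b'" into {"a", "b"}, tolerating straight or curly quotes."""
    return {
        part.strip().strip("'\"\u2018\u2019")
        for part in body.split(",")
        if part.strip()
    }


def parse_stale_ips(line: str):
    """Return (removed_ips, known_ips) from a LoadMetrics stale-mapping line, or None if no match."""
    match = STALE_RE.search(line)
    if match is None:
        return None
    return _to_set(match.group(1)), _to_set(match.group(2))


line = (
    "2025-06-26 23:11:00,547 INFO load_metrics.py:159 -- LoadMetrics: "
    "Removed 1 stale ip mappings: {'202.123.453.123'} not in {'10.0.3.4', '10.0.3.5'}"
)
removed, known = parse_stale_ips(line)
print("removed:", sorted(removed))
print("known to autoscaler:", sorted(known))
```

In my case the removed address is the manually added worker's IP, while the known set contains only the private IPs of the nodes the autoscaler launched.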