1. Severity of the issue: (select one)
Low: Annoying but doesn’t hinder my work.
2. Environment:
- Ray version: ray, version 2.52.1
- Python version: Python 3.11.14
- OS: Win11
- Cloud/Infrastructure: None
- Other libs/tools (if relevant): PyTorch, Lightning, WandB
3. What happened vs. what you expected:
- Expected: No RPC errors in the logs.
- Actual: RPC errors are logged repeatedly.
I keep seeing:
(pid=gcs_server) [2025-12-10 10:32:21,503 E 52384 54472] (gcs_server.exe) gcs_server.cc:303: Failed to establish connection to the event+metrics exporter agent. Events and metrics will not be exported. Exporter agent status: RpcError: Running out of retries to initialize the metrics agent. rpc_code: 14
(raylet) [2025-12-10 10:32:25,037 E 94496 93984] (raylet.exe) main.cc:979: Failed to establish connection to the metrics exporter agent. Metrics will not be exported. Exporter agent status: RpcError: Running out of retries to initialize the metrics agent. rpc_code: 14
in my training process.
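For reference, here is a minimal sketch for decoding the `rpc_code` at the end of these lines. It assumes the value is a standard gRPC status code, which the log itself does not confirm. Under that assumption, 14 corresponds to `UNAVAILABLE`, meaning the process could not reach the agent at all. The helper name `describe_rpc_code` and the small lookup table are illustrative only and not part of Ray.

```python
import re
from typing import Optional

# Subset of the standard gRPC status codes.
# Assumption: the rpc_code in the Ray log lines follows this numbering.
GRPC_STATUS_NAMES = {
    0: "OK",
    4: "DEADLINE_EXCEEDED",
    14: "UNAVAILABLE",
}

RPC_CODE_PATTERN = re.compile(r"rpc_code:\s*(\d+)")


def describe_rpc_code(log_line: str) -> Optional[str]:
    """Return the gRPC status name for the rpc_code in a log line, if any."""
    match = RPC_CODE_PATTERN.search(log_line)
    if match is None:
        return None  # the line carries no rpc_code field
    code = int(match.group(1))
    return GRPC_STATUS_NAMES.get(code, f"UNKNOWN_CODE_{code}")


# Tail of the raylet message quoted above.
line = (
    "Exporter agent status: RpcError: Running out of retries "
    "to initialize the metrics agent. rpc_code: 14"
)
print(describe_rpc_code(line))
```

If the assumption holds, the message points to the metrics agent process not being reachable, not to a bad request being sent to it.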
I tried the following:
if ray.is_initialized():
    log.warning("Ray was already initialized. Shutting down to apply new configuration...")
    ray.shutdown()
if not ray.is_initialized():
    log.info("Initializing Ray with dashboard disabled to prevent connection errors...")
    ray.init(include_dashboard=False, ignore_reinit_error=True)
I also tried setting this in PowerShell:
$env:RAY_include_dashboard = 0
to silence the messages, since they look related to the dashboard, but it does not work.
It is not preventing me from tuning, but I really want to know what went wrong.