Hi everyone,
I’ve been running distributed training jobs across a local cluster, and I’ve hit a persistent problem: the Ray Dashboard becomes very sluggish, and I see sporadic “Worker heartbeat timeout” errors in my logs even when the cluster isn’t at full capacity.
I am currently using a community setup (https://deltaexeutor.com/vng/) to handle automated local logging and script-based data scrubbing for my experiments in the background while the Ray head node is active. Whenever the background executor starts a heavy processing cycle, the dashboard’s metric updates start to “hang.” It’s particularly frustrating when I’m trying to monitor resource utilization or debug a failed actor, because the UI often triggers a “Page Unresponsive” error or the GCS seems to struggle to maintain its socket connection.
It feels like the background process is competing for the same system resources (specifically CPU threads and memory bandwidth) that the Ray head node needs for its internal scheduling and the dashboard’s real-time state management. I have a few related questions for the distributed systems experts here:
I’m not sure whether the way this style of executor environment manages its internal threading is causing direct resource contention with Ray’s gRPC calls or with the local drivers used for secure socket communication. Has anyone else encountered performance bottlenecks or “Socket Connection” timeouts while running high-level script executors alongside their Ray head nodes? I am also wondering whether there is a recommended way to lower the executor’s CPU priority so it doesn’t “starve” the head node of the resources it needs for stable worker heartbeats and dashboard responsiveness.
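To make the priority question concrete, here is a minimal sketch of the kind of isolation I have in mind. It assumes a Linux head node and that the background job can be started from Python; the helper names `split_cpus` and `launch_low_priority` and the script name in the usage comment are hypothetical, not part of Ray or of the executor. It only uses standard-library calls (`os.nice`, `os.sched_setaffinity`, `subprocess.Popen`), and I have not confirmed that it actually fixes the heartbeat timeouts.

```python
import os
import subprocess
import sys


def split_cpus(reserved_for_ray=2):
    """Split the CPUs available to this process into (ray_cpus, background_cpus)."""
    if hasattr(os, "sched_getaffinity"):
        cpus = sorted(os.sched_getaffinity(0))  # Linux: CPUs we may run on
    else:
        cpus = list(range(os.cpu_count() or 1))  # fallback on other platforms
    # Always leave at least one CPU for the background job.
    reserved = max(0, min(reserved_for_ray, len(cpus) - 1))
    return cpus[:reserved], cpus[reserved:]


def launch_low_priority(cmd, background_cpus, niceness=15):
    """Start cmd with a higher nice value, pinned to background_cpus (POSIX only)."""

    def demote():
        # Runs in the child just before exec: higher niceness = lower priority.
        os.nice(niceness)
        if hasattr(os, "sched_setaffinity"):
            os.sched_setaffinity(0, background_cpus)

    return subprocess.Popen(cmd, preexec_fn=demote)


# Hypothetical usage: keep two cores free for the Ray head node processes.
# ray_cpus, bg_cpus = split_cpus(reserved_for_ray=2)
# proc = launch_low_priority([sys.executable, "scrub_logs.py"], bg_cpus)
```

This only restricts the background job; the Ray processes themselves are left unpinned, so they can still use every core. Whether niceness alone is enough, or whether memory bandwidth is the real bottleneck, is exactly what I am unsure about.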
I really need to keep this automation active to stay on top of my project logs, but the constant UI lag is making it difficult to keep an eye on my scaling tasks. If anyone has experience optimizing a professional workstation or a head node for concurrent usage of heavy script executors and Ray, I’d love to hear your advice!
Thanks for the help!