Is dashboard/agent.py supposed to be at 100% CPU?

0939013 · April 19, 2024, 9:55pm

How severely does this issue affect your experience of using Ray?

High: It blocks me from completing my task.

I’m using ray.remote() to launch a number of parallel tasks, and I’m finding that the rate of task completion is unexpectedly bottlenecked. I noticed that dashboard/agent.py is constantly at or near 100% CPU. Does that indicate that Ray internals are the limiting factor? If so, is there a way to eliminate unnecessary work internal to Ray? For example, I don’t need the dashboard functionality and just need maximum throughput. Any suggestions welcome, thanks!

ruisearch42 · April 19, 2024, 10:20pm

I think Ray currently always starts the dashboard. The dashboard should not take 100% CPU, so something weird may be happening. Btw, how large is your job?

0939013 · April 19, 2024, 10:40pm

What is a good way to dig into Ray to investigate?

In our job, about 500-1000 tasks are running at any given time. However, they are mostly blocked on external service requests, so very few are actually executing on the Ray host machine at any given time.
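
One possible starting point for investigating, offered as a rough sketch rather than an official Ray procedure: confirm which process is burning the CPU by filtering `ps` output for the `dashboard/agent.py` command line. The helper names (`parse_ps_output`, `find_agent_processes`) are hypothetical and made up for illustration, and the sketch assumes a Linux host where `ps -eo pid,pcpu,args` is available.

```python
import subprocess
from typing import List, Tuple

# Substring of the process command line reported in this thread.
AGENT_MARKER = "dashboard/agent.py"


def parse_ps_output(
    ps_output: str, marker: str = AGENT_MARKER
) -> List[Tuple[int, float, str]]:
    """Return (pid, cpu_percent, command) for each row whose command contains marker.

    Expects the text printed by `ps -eo pid,pcpu,args`, header row included.
    """
    matches = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, %cpu, rest of the command line
        if len(parts) < 3:
            continue
        pid_text, cpu_text, command = parts
        if marker not in command:
            continue
        try:
            matches.append((int(pid_text), float(cpu_text), command))
        except ValueError:
            continue  # ignore rows that do not parse cleanly
    return matches


def find_agent_processes() -> List[Tuple[int, float, str]]:
    """Run ps (assumed to be available on the host) and return matching rows.

    Note: on Linux, the pcpu column is CPU time divided by process lifetime,
    so it is an average since the process started, not an instantaneous value.
    """
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps_output(result.stdout)
```

Calling `find_agent_processes()` on the Ray host should list the agent's PID. With a PID in hand, a sampling profiler such as py-spy (`py-spy top --pid <PID>` or `py-spy dump --pid <PID>`) can then show where the agent is spending its time, which may help narrow things down toward a reproduction.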

Kai-Hsun_Chen · April 23, 2024, 4:10am

Hi @0939013, if you can make an easy reproduction, I can take a look at the issue. Thanks!

Liquidmasl · April 4, 2025, 8:41am

I'm going to necro this post exactly a year later, but it's the only resource I can find…

I'm here with the same issue, but sadly also without a minimal reproducer.
OP, did you eventually resolve the issue?

I bring pictures!
I have a bunch of Ray remote functions running as well, but they seem to run very slowly, or at least they are not really using any resources. Meanwhile, one physical core is at 100% all the time, and btop tells me it's the Ray dashboard agent.
Could it be that it is blocking the rest?
Why would that happen?

The dashboard is not switched on btw…
