I want to be able to see how much memory/CPU a given actor/task is currently using (and possibly log this data or make application/scheduling decisions based on it). I would also like to programmatically track shared object store usage. Is there a Python API for this?
@dirtyValera
Unfortunately, we currently don’t support CPU/memory usage per actor/task, but this is something we are looking into. One of the blockers is the cardinality of such data, given that the number of tasks/actors can be rather large in Ray.
I would also like to programmatically track shared object store usage. Is there a Python API for this?
AFAIK, you could probably try one of the following (kind of hacky, unfortunately):
For cluster-level resource usage, you could parse the object store usage from the autoscaler’s status. See the example query usage from ray status here.
Or, if you have Prometheus set up, you could scrape the ray_object_store_memory metric programmatically.
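For the Prometheus option, a minimal sketch could look like the one below. It assumes a Prometheus server that already scrapes your Ray cluster and is reachable at http://localhost:9090 (adjust to your setup). It uses Prometheus's standard instant-query HTTP endpoint; the helper names are made up for illustration, and the labels and units attached to ray_object_store_memory may differ between Ray versions, so inspect the output before relying on it.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, metric="ray_object_store_memory"):
    """Build a Prometheus instant-query URL for the given metric name."""
    query = urllib.parse.urlencode({"query": metric})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def parse_instant_query(payload):
    """Turn a Prometheus instant-query response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    # Each series carries its labels and a [timestamp, "value"] pair.
    return [
        (series["metric"], float(series["value"][1]))
        for series in payload["data"]["result"]
    ]


def fetch_object_store_memory(base_url="http://localhost:9090"):
    """Query Prometheus (address is an assumption) for the object store metric."""
    with urllib.request.urlopen(build_query_url(base_url), timeout=5) as resp:
        return parse_instant_query(json.load(resp))
```

Calling fetch_object_store_memory() from a monitoring loop would return one entry per time series that Prometheus holds for the metric, which you could then log or aggregate per node.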
If you could share a bit more about your use case, that would be great. We are actively working on resource observability for the coming releases, so knowing the use cases would help us prioritize.