Monitor state of submitted Tasks

Hi,

I was wondering if there’s a built-in feature to monitor the state of tasks submitted to a @ray.remote function. By “state” I mean whether a task is “pending”, “running” or “finished” (similar to what ray.tune already reports).

My use case is quite simple: (1) a function submits N ray tasks (these will be scheduled and run in parallel just like ray’s core functionality); (2) however, I don’t have/want a ray.get() blocking call that waits for these tasks to complete; (3) in other bits of my code I need to check at any given point how many of the submitted runs are “running”.

I cannot find any built-in functionality for this, so this is how I have implemented it:

  • I have an auxiliary Ray Actor that’s shared via an ActorHandle with each of the submitted tasks. The Actor class essentially behaves as a counter: when one of the submitted tasks starts running, it increments the counter in the Actor by one. Before the task finishes, i.e. before the @ray.remote function exits, the counter is decremented by one. Below is the simplified code:
import ray
from ray.actor import ActorHandle

@ray.remote
def task_launcher(task_id: int, monitor: ActorHandle):
    # increment counter and set task as "running"
    monitor.task_running.remote(task_id)
    try:
        ...  # do stuff
    finally:
        # decrement counter and set task as "finished", even if the work raises
        monitor.task_finished.remote(task_id)
  • Having this Actor allows me to monitor how many tasks are running at a given point. This bit of information is passed to other functions in my code. (it can easily be extended to measure how many tasks have been completed and how many are still pending).
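For readers following along, here is a minimal sketch of what such a monitor actor could look like. It assumes the method names `task_running` and `task_finished` from the snippet above; the class name `TaskMonitor`, the `task_submitted` and `counts` methods, the state strings and the `run_with_ray` helper are hypothetical and not part of Ray. The state logic lives in a plain class that is wrapped with `ray.remote` only when used, so it can be checked without a cluster.

```python
from collections import Counter


class TaskMonitor:
    """Tracks the state of each task by id (hypothetical helper, not a Ray API)."""

    def __init__(self):
        self.states = {}  # task_id -> "pending" | "running" | "finished"

    def task_submitted(self, task_id):
        # Called by the driver after submitting. setdefault avoids downgrading
        # a task that has already reported itself as running or finished.
        self.states.setdefault(task_id, "pending")

    def task_running(self, task_id):
        self.states[task_id] = "running"

    def task_finished(self, task_id):
        self.states[task_id] = "finished"

    def counts(self):
        # Number of tasks currently in each state.
        tally = Counter(self.states.values())
        return {s: tally.get(s, 0) for s in ("pending", "running", "finished")}


def run_with_ray(num_tasks=4):
    """Example wiring (assumption: Ray is installed)."""
    import time
    import ray

    @ray.remote
    def task_launcher(task_id, monitor):
        monitor.task_running.remote(task_id)
        try:
            time.sleep(1)  # stand-in for the real work
        finally:
            monitor.task_finished.remote(task_id)

    ray.init(ignore_reinit_error=True)
    monitor = ray.remote(TaskMonitor).remote()  # wrap the plain class as an actor
    for i in range(num_tasks):
        monitor.task_submitted.remote(i)
        task_launcher.remote(i, monitor)
    # Does not wait for the tasks themselves, only for the actor's reply.
    return ray.get(monitor.counts.remote())
```

Calling `run_with_ray()` returns a snapshot of the counts at that moment; since the snapshot depends on scheduling, the exact numbers will vary between runs.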

Am I re-inventing the wheel here? Is this functionality already present?

Can ray.wait([...]) meet the use case?

Good point. I was using ray.wait() before but, since it only returns the lists of “completed” and “uncompleted” objects, it requires extra logic to determine how many of the “uncompleted” ones are currently running. I found the Actor approach more convenient, but it still requires a few lines of code just to get this simple functionality working. This is why I was wondering if something simpler already exists in the Ray framework.
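To illustrate the limitation described here, a non-blocking poll with `ray.wait` might look like the sketch below. The helper name `poll_finished` is hypothetical; `timeout=0` makes the call return immediately, and `num_returns=len(refs)` asks for every ref that is already done.

```python
def poll_finished(refs):
    """Split ObjectRefs into (finished, unfinished) without blocking."""
    if not refs:
        return [], []  # nothing submitted yet, nothing to poll
    import ray  # imported lazily so the helper can be loaded without Ray

    ready, not_ready = ray.wait(refs, num_returns=len(refs), timeout=0)
    return ready, not_ready
```

The second list mixes tasks that are still queued with tasks that are actually executing, so telling “pending” apart from “running” still needs extra bookkeeping such as the counter Actor.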

Hey @jafermarq. Thanks for asking this. Your approach definitely works, and I would do the same thing if I wanted more detailed information about each task. You can even improve observability by tracking other metrics such as running time or memory usage.

Unfortunately, there’s no built-in feature for task stats. We tried implementing it in the past, but it was somewhat deprioritized. If you are interested in a built-in feature, it would be great if you opened a feature request so the team can see the user demand (which means we will be more likely to spend resources on it)!

plus 1 on this request, I think it will help me with the issue raised here:
