Tools for debugging Ray applications

We’d like to build better tools for debugging Ray applications. What tools would be most useful to you?

Some examples:

  • Setting breakpoints in tasks/actors
  • Performance profiling for tasks/actors

What would you love to see in Ray?

Thanks for asking the question, Robert. There are various ways to debug and observe Ray applications.

As of Ray’s master branch (Nov 6th), these tools are available:

  • Interactive debugging using pdb: https://docs.ray.io/en/master/ray-debugging.html
  • Monitoring support: https://docs.ray.io/en/master/ray-metrics.html
  • Profiling guide: https://docs.ray.io/en/master/profiling.html
  • Debugging guide: https://docs.ray.io/en/master/debugging.html
  • Dashboard: https://docs.ray.io/en/master/ray-dashboard.html (Ray’s dashboard is a UI tool that gives you visibility into the Ray cluster and its applications).

General: The recent PDB-style debugging tools look amazing. I haven’t had time to integrate them into my workflow yet, but the API is perfect.

RLlib specific: We really, really, really need a better way of smuggling logs from individual environments to the main process. The callbacks available at the moment are clunky at best and really only suited to performance timings (which would be nice to have as an independent feature). Best solution IMO would be to enable us to add arbitrary data to the “infos” structure and aggregate it in the main process between epochs.
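
To make the request concrete, here is a minimal sketch in plain Python of what that aggregation could look like. It is not RLlib's API: `aggregate_infos` and the sample keys (`distance`, `collisions`, `note`) are hypothetical names chosen for illustration, and it assumes each environment step yields an info dict whose numeric entries should be averaged over an epoch.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, Iterable, List


def aggregate_infos(infos: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Average every numeric key found across per-step info dicts.

    Hypothetical helper for illustration; not part of Ray or RLlib.
    """
    buckets: Dict[str, List[float]] = defaultdict(list)
    for info in infos:
        for key, value in info.items():
            # Keep ints and floats only; bool is a subclass of int, so skip it.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                buckets[key].append(float(value))
    return {key: mean(values) for key, values in buckets.items()}


# Made-up per-step infos, shaped like what an environment's step() might return.
epoch_infos = [
    {"distance": 1.0, "collisions": 0},
    {"distance": 3.0, "collisions": 2, "note": "ignored"},
]
summary = aggregate_infos(epoch_infos)
```

Averaging is only one possible reduction. Sums, maxima, or histograms would fit the same shape, and non-numeric entries such as `note` are simply dropped in this sketch.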

@sven1977 There’s feedback about RLlib debugging here.

Hey @jsuarez5341, thanks for the great suggestion. I’ll create an enhancement issue for this, and we can discuss it further on git. …