Intercept Code Execution on the Cluster Side

How severely does this issue affect your experience of using Ray?

  • LOW

One of the Ray capabilities we are fond of is that it lets a caller write arbitrary code and execute it as-is on a more powerful cluster without much intervention. The issue, however, is that we would like to hook into the call stack on the cluster side so that we can inject logic into (or wrap) those calls to suit our needs.

For example, let’s say we want to allow the user to load data into a dataframe, but before the data is returned to the user we want to intercept the call, authenticate the user, and apply a filter to the data.
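To make the idea concrete, here is a rough sketch of the kind of wrapping we have in mind, in plain Python. Nothing here is a Ray API: `intercepted`, `authenticate`, `ALLOWED_REGIONS`, and `load_data` are hypothetical names, and a list of dicts stands in for the dataframe. In principle the wrapped function could then be submitted to the cluster like any other function, but the point of this request is to have the cluster apply such a wrapper itself.

```python
import functools

# Hypothetical policy table: which regions each known user may see.
ALLOWED_REGIONS = {"alice": {"eu"}, "bob": {"eu", "us"}}


def authenticate(user):
    """Return the regions the user may read, or raise if the user is unknown."""
    if user not in ALLOWED_REGIONS:
        raise PermissionError(f"unknown user: {user}")
    return ALLOWED_REGIONS[user]


def intercepted(func):
    """Wrap a data-loading function so its result is filtered per caller."""

    @functools.wraps(func)
    def wrapper(user, *args, **kwargs):
        regions = authenticate(user)      # step 1: authenticate the caller
        rows = func(*args, **kwargs)      # step 2: run the user's code as-is
        # step 3: filter the result before it goes back to the caller
        return [row for row in rows if row["region"] in regions]

    return wrapper


@intercepted
def load_data():
    # Stand-in for the user's arbitrary data-loading code.
    return [
        {"id": 1, "region": "eu"},
        {"id": 2, "region": "us"},
    ]
```

Calling `load_data("alice")` would return only the `eu` row, while an unknown user would get a `PermissionError` before any data is loaded.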

Other frameworks allow engineers to do this via middleware injection, and I’m curious whether Ray supports (or will support) this capability. It would be a huge win for us from an engineering standpoint: today we have to route the calls through a proxy server, and as a result we have disabled direct Ray cluster access for users.

Thanks!

To be clear, you’d like to add a callback across the same job, right? Something like the callbacks described in "Diagnostics (local)" in the Dask documentation?

Do you mind creating a feature request on GitHub?

Yes! This is very similar to what we could use.

If we could get a handle on the execution of the job, we could in theory augment it to suit our needs, for example to identify the caller.

Will work on the feature request.
