Ability to connect to multiple Ray clusters

Is it possible to connect to multiple Ray clusters from the same notebook?

Thanks for opening this question @devin-petersohn!

Currently this isn’t possible, but Barak (who I’ll add to the discourse) is working on a related feature (Ray Client).

Just for reference, can you post a bit about your requirements/use case?

One of our common use cases is to debug issues happening in a “production” environment on a staging area or test cluster. We would like to enable the user to connect to both clusters from the same notebook and interleave executions. Does this make sense?

Some questions

  1. Is it critical to be connected to both clusters at the same time, or would it be OK to connect to one, disconnect, and then connect to the other?
  2. Do you ever pass object refs created by a task on one cluster into a task on the other cluster?

How does the debugging scenario that you described work? Is it that you run the same code on both clusters and look for differences in the behavior?


No, we only need to be connected to one cluster at a time. The connection needs to stay alive long enough to submit tasks, but can be closed afterward.

No, at least not at first.

The user will debug in the staging cluster with a subset of the data, then submit a set of tasks to a production cluster for the full dataset after they have sufficiently debugged the query.
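
For later readers, the connect, submit, disconnect workflow described here can be sketched roughly as follows. This is a minimal illustration and not the Ray Client API: `cluster_session`, the stand-in `fake_connect`/`fake_disconnect` functions, `run_query`, and both address strings are hypothetical names introduced only for this example.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def cluster_session(
    address: str,
    connect: Callable[[str], object],
    disconnect: Callable[[], None],
) -> Iterator[None]:
    """Hold a connection to exactly one cluster for the duration of the block."""
    connect(address)
    try:
        yield
    finally:
        # Always disconnect, even if the work inside the block raised,
        # so the next session can connect to a different cluster.
        disconnect()


# Stand-in connection functions (hypothetical): they only record calls,
# so the sketch runs without any cluster.
events: List[str] = []


def fake_connect(address: str) -> None:
    events.append(f"connect {address}")


def fake_disconnect() -> None:
    events.append("disconnect")


def run_query(data: List[int]) -> int:
    # Placeholder for the real work that would be submitted as tasks.
    return sum(data)


full_dataset = list(range(10))

# Debug on the staging cluster with a subset of the data.
with cluster_session("staging-address", fake_connect, fake_disconnect):
    staging_result = run_query(full_dataset[:3])

# Once the query is debugged, run it on production with the full dataset.
with cluster_session("production-address", fake_connect, fake_disconnect):
    production_result = run_query(full_dataset)
```

The context manager guarantees that only one connection is open at a time and that it is closed before the next one is opened. On a Ray version that ships Ray Client, one could presumably pass `ray.init` and `ray.shutdown` as the `connect` and `disconnect` arguments with a `ray://<host>:<port>` address, but that is an assumption and not something confirmed in this thread.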

Thanks, this is helpful!


Yeah, I’ll be sure to circle back once we enable this (it is tracked as the Ray Client feature in the GitHub issues). It sounds like it will be entirely doable 🙂
