Queue difference between JobSubmission and a normal Ray job started via ray.init

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I know Ray implicitly uses a queue to schedule jobs. Does the queue behave differently if a) I use JobSubmission to submit all my jobs to a remote Ray cluster, or b) I use ray.init to connect to the remote Ray cluster and then call ray.get([all remote jobs])? We implemented backpressure as described in "Pattern: Using ray.wait to limit the number of pending tasks" (Ray 2.1.0 docs).

We have observed a crash in scenario b), while a) is fine.

In Ray, jobs and tasks are different concepts. A job typically refers to a Python script that calls ray.init().

Both the job submission API and ray.init() just create a driver for the cluster. Tasks are submitted and queued in the same way in both cases.

Re: crash: I'm not sure why you are seeing a crash in this case… it'd be great if you could give us a repro script!