Mid-way through a job, ray.util.queue.Queue.get() reports "Ray has not been started yet"

I am getting this unexpected error midway through my Ray job.

I start a job with 5097 tasks and make many successful Queue.get() calls. About 15% of the way through the job, Queue.get() suddenly complains that Ray has not been started yet.

Any suggestions on how to debug?

Traceback (most recent call last):
  File "/home/djakubiec/var/jupyter/marketNeutralEquityStrategy/test.build/mnesRanking.py", line 1765, in <module>
    rayClient.processJob(parameters.priority, [(parametersReference, tickerRegion) for tickerRegion in tickerRegions], carryForwardRequestHandler, carryForwardCompletionHandler)
  File "/ceph/var/ray/tools/python.modules/focusvq/raytools/RayClient.py", line 77, in processJob
    threadsAvailable = threadQueue.get()
  File "/ceph/var/users/djakubiec/anaconda3/envs/ray2/lib/python3.7/site-packages/ray/util/queue.py", line 155, in get
    return ray.get(self.actor.get.remote(timeout))
  File "/ceph/var/users/djakubiec/anaconda3/envs/ray2/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 62, in wrapper
    return func(*args, **kwargs)
  File "/ceph/var/users/djakubiec/anaconda3/envs/ray2/lib/python3.7/site-packages/ray/worker.py", line 1471, in get
    worker.check_connected()
  File "/ceph/var/users/djakubiec/anaconda3/envs/ray2/lib/python3.7/site-packages/ray/worker.py", line 218, in check_connected
    raise RaySystemError("Ray has not been started yet. You can "
ray.exceptions.RaySystemError: System error: Ray has not been started yet. You can start Ray with 'ray.init()'.
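
One way to narrow this down might be to check the connection state immediately before each Queue.get() and log which process and thread made the call. The sketch below is only an illustration: checked_get is a hypothetical helper, not part of Ray or of the RayClient.py in the traceback, and it assumes the queue object exposes get(timeout=...) as ray.util.queue.Queue does.

```python
import logging
import os
import threading

log = logging.getLogger("ray_queue_debug")


def checked_get(queue, is_initialized, timeout=None):
    """Hypothetical helper: verify the connection before a blocking get.

    `queue` is any object with a get(timeout=...) method.
    `is_initialized` is a zero-argument callable that returns True
    while the driver is still connected (e.g. ray.is_initialized).
    """
    if not is_initialized():
        # Record where the disconnect was first noticed, so it can be
        # matched against timestamps in the Ray session logs.
        log.error(
            "Connection lost before Queue.get(): pid=%s thread=%s",
            os.getpid(),
            threading.current_thread().name,
        )
        raise RuntimeError("Ray is no longer initialized in this process")
    return queue.get(timeout=timeout)
```

In the job above this would be called as checked_get(threadQueue, ray.is_initialized). If the error is logged from an unexpected process or thread, or shortly after some other code calls ray.shutdown(), that would point to where the connection is being dropped.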

Do you mind posting a zip of /tmp/ray/session_latest/logs?
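
In case it helps, here is a minimal sketch of one way to build that zip with the Python standard library. The function name zip_ray_logs and the output name ray_logs are made up for illustration; only the log directory path comes from this thread.

```python
import shutil
from pathlib import Path


def zip_ray_logs(log_dir="/tmp/ray/session_latest/logs", out_base="ray_logs"):
    """Zip the contents of log_dir and return the archive path.

    out_base is the archive name without the .zip suffix
    (an arbitrary choice for this sketch).
    """
    # session_latest is normally a symlink; resolve it so the archive
    # contains the real files.
    log_path = Path(log_dir).resolve()
    if not log_path.is_dir():
        raise FileNotFoundError(f"No log directory at {log_path}")
    # make_archive appends ".zip" to out_base and returns the full name.
    return shutil.make_archive(out_base, "zip", root_dir=str(log_path))
```

Calling zip_ray_logs() on the machine that ran the driver should produce ray_logs.zip in the current working directory.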

This seems like a client error @ijrsvt

Sure.

Sorry for the n00b question, but how do I post the zip here? When I try to copy it here I get:

Hmm, maybe you could post a Google Drive link to the zip? Or you could also create a GitHub issue!

Thanks @rliaw, I opened a GitHub issue.