High: It blocks me from completing my task.
I am very new to Ray. I submit a function to a Ray cluster about 200 times, each time with a different parameter. My code hangs after a few submissions (the behavior is not always reproducible). After sending SIGHUP, I noticed that execution was stuck in Worker._release_server.
Code details:
    def _release_server(self, id: bytes) -> None:
        if self.data_client is not None:
            logger.debug(f"Releasing {id.hex()}")
            self.data_client.ReleaseObject(
                ray_client_pb2.ReleaseRequest(ids=[id]))
My code is very simple as follows:
    @ray.remote
    def get_data(date):
        …

    ref_jobs = [get_data.remote(date) for date in dates]
    results = ray.get(ref_jobs)
get_data calls a function imported from my personal project, which is uploaded through py_modules.
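
To narrow down where the hang occurs, one option is to submit the tasks in smaller batches and wait on each batch with a timeout instead of calling ray.get on all 200 references at once. The following is only a minimal diagnostic sketch, not a confirmed fix: the helper names chunked and run_in_batches and the values of batch_size and timeout_s are made up for illustration, and it assumes get_data and dates are defined as above.

```python
from typing import Any, Iterator, List, Sequence, Tuple


def chunked(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(remote_fn, dates: Sequence[Any],
                   batch_size: int = 20,
                   timeout_s: float = 600.0) -> Tuple[List[Any], List[Any]]:
    """Submit `remote_fn` per date in batches; report dates that time out.

    `batch_size` and `timeout_s` are arbitrary illustrative defaults.
    """
    import ray  # imported lazily; assumes ray.init(...) was already called

    results = []  # (date, value) pairs for tasks that finished in time
    stalled = []  # dates whose task did not finish within timeout_s
    for batch in chunked(dates, batch_size):
        refs = [remote_fn.remote(date) for date in batch]
        # ray.wait returns (ready, not_ready) once all refs are ready
        # or the timeout expires, whichever comes first.
        ready, _not_ready = ray.wait(
            refs, num_returns=len(refs), timeout=timeout_s)
        ready_set = set(ready)
        for date, ref in zip(batch, refs):
            if ref in ready_set:
                results.append((date, ray.get(ref)))
            else:
                stalled.append(date)
    return results, stalled
```

Calling `results, stalled = run_in_batches(get_data, dates)` would then show whether specific dates stall consistently or whether the hang moves around between runs. This only localizes the problem; it does not address the cause inside Worker._release_server.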