Retry exception

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I managed to create a remote raykube cluster on GKE. I'm forwarding port 10001, but I'm getting this error on my client:

Caught schedule exception
Traceback (most recent call last):
  File "/home/joao/PycharmProjects/learningOrchestra2/t.py", line 89, in <module>
    ray.init("ray://127.0.0.1:10001")
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/worker.py", line 882, in init
    return builder.connect()
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/client_builder.py", line 167, in connect
    dashboard_url = ray.get(get_dashboard_url.options(num_cpus=0).remote())
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 104, in wrapper
    return getattr(ray, func.__name__)(*args, **kwargs)
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/util/client/api.py", line 43, in get
    return self.worker.get(vals, timeout=timeout)
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/util/client/worker.py", line 433, in get
    res = self._get(to_get, op_timeout)
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/util/client/worker.py", line 450, in _get
    req = ray_client_pb2.GetRequest(ids=[r.id for r in ref], timeout=timeout)
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/util/client/worker.py", line 450, in <listcomp>
    req = ray_client_pb2.GetRequest(ids=[r.id for r in ref], timeout=timeout)
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/util/client/common.py", line 141, in id
    return self.binary()
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/util/client/common.py", line 120, in binary
    self._wait_for_id()
  File "/home/joao/PycharmProjects/learningOrchestra2/venv3/lib/python3.7/site-packages/ray/util/client/common.py", line 197, in _wait_for_id
    self._set_id(self._id_future.result(timeout=timeout))
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
TypeError: The type of keyword 'retry_exceptions' must be <class 'bool'>, but received type <class 'NoneType'>

Also, is it possible to forward that connection to the public IP address of the cluster?


Solved by updating the client to 1.13.
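Since the fix was a client upgrade, the TypeError was most likely a symptom of the local Ray package being older than the one running on the cluster. A minimal sketch of a pre-flight check that fails early with a clearer message is below. The helper names `versions_match` and `check_client_version` are hypothetical, and the cluster version string "1.13.0" is an assumption based on the fix above; substitute whatever version your cluster image actually runs.

```python
def versions_match(client_version: str, cluster_version: str) -> bool:
    """Return True when the first three version components are identical."""
    return client_version.split(".")[:3] == cluster_version.split(".")[:3]


def check_client_version(client_version: str, cluster_version: str) -> None:
    """Raise a readable error before connecting if the versions differ."""
    if not versions_match(client_version, cluster_version):
        raise RuntimeError(
            f"Local Ray version {client_version} does not match the cluster's "
            f"{cluster_version}; install the matching version, e.g. "
            f"pip install ray=={cluster_version}"
        )


# Typical use on the client, before connecting (requires ray to be installed):
#
#   import ray
#   CLUSTER_RAY_VERSION = "1.13.0"  # assumption: version running on the cluster
#   check_client_version(ray.__version__, CLUSTER_RAY_VERSION)
#   ray.init("ray://127.0.0.1:10001")
```

The comparison is deliberately strict (exact match on major, minor, and patch), since that is the safest assumption when the client and the cluster exchange serialized options.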
