Hi,
I have a Ray cluster running on Kubernetes. Starting a client works, but I get a strange error when creating a backend (example from the documentation, Key Concepts — Ray v1.1.0):
>>> client.create_backend("simple_backend_class", RequestHandler, "hello, world!")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/api.py", line 31, in check
    return f(self, *args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/api.py", line 295, in create_backend
    replica_config))
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/worker.py", line 1379, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(RayServeException): ray::ServeController.create_backend() (pid=75, ip=10.244.1.242)
  File "python/ray/_raylet.pyx", line 463, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 412, in ray._raylet.execute_task.function_executor
  File "python/ray/_raylet.pyx", line 1501, in ray._raylet.CoreWorker.run_async_func_in_event_loop
  File "/home/ray/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/home/ray/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/controller.py", line 836, in create_backend
    raise e
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/controller.py", line 833, in create_backend
    backend_config.num_replicas)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/controller.py", line 282, in _scale_backend_replicas
    num_possible, current_num_replicas + num_possible))
ray.serve.exceptions.RayServeException: Cannot scale backend simple_backend_class to 1 replicas. Ray Serve tried to add 1 replicas but the resources only allows 0 to be added. To fix this, consider scaling to replica to 0 or add more resources to the cluster. You can check avaiable resources with ray.nodes().
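For reference, the last line of the error says that no free resources were left for even one replica. The snippet below is only a rough sketch of that kind of check, not Ray Serve's actual scheduling logic. `replicas_that_fit` is a hypothetical helper, the one-CPU-per-replica requirement is an assumption, and the resource numbers are made up. On a live cluster the `available` dict could come from `ray.available_resources()`.

```python
import math
from typing import Dict


def replicas_that_fit(available: Dict[str, float],
                      per_replica: Dict[str, float]) -> int:
    """Estimate how many replicas fit, treating resources as one cluster-wide pool.

    Hypothetical helper for illustration only. Real schedulers place replicas
    per node, so this pooled estimate can be too optimistic.
    """
    counts = []
    for resource, needed in per_replica.items():
        if needed <= 0:
            continue  # a zero requirement never limits scheduling
        free = available.get(resource, 0.0)
        counts.append(math.floor(free / needed))
    if not counts:
        raise ValueError("per_replica must contain at least one positive requirement")
    # The scarcest resource decides how many replicas fit.
    return min(counts)


# Made-up snapshot in which other actors already hold every CPU.
# On a real cluster: available = ray.available_resources()
available = {"CPU": 0.0, "memory": 2.0e9}
print(replicas_that_fit(available, {"CPU": 1}))  # prints 0
```

If a check like this reports 0 for the real cluster, the CPUs are probably already claimed by other actors or were never advertised by the worker pods.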
I connected to the Ray cluster like this:
import os

import ray
from ray import serve

if __name__ == "__main__":
    if ("RAY_HEAD_SERVICE_HOST" not in os.environ
            or os.environ["RAY_HEAD_SERVICE_HOST"] == ""):
        raise ValueError("RAY_HEAD_SERVICE_HOST environment variable empty. "
                         "Is there a ray cluster running?")
    redis_host = os.environ["RAY_HEAD_SERVICE_HOST"]
    ray.init(address=redis_host + ":6379")
    # backend_config = serve.BackendConfig(num_replicas=1)
    # client = serve.start(detached=True, http_host="0.0.0.0")
    # Attach to the Serve instance that is already running on the cluster.
    client = serve.connect()
I saw this post, but didn’t find any more information about what the error could mean:
Kind regards,
Karlo