I have a K8s cluster running and am trying to deploy rayproject/ray-ml:nightly nodes for both head and workers. After ray up completes, running ray monitor cluster.yaml shows the following stack trace:
2021-04-01 19:28:44,147 ERROR monitor.py:245 -- Error in monitor loop
Traceback (most recent call last):
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/monitor.py", line 276, in run
self._initialize_autoscaler()
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/monitor.py", line 125, in _initialize_autoscaler
event_summarizer=self.event_summarizer)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/autoscaler.py", line 86, in __init__
self.reset(errors_fatal=True)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/autoscaler.py", line 521, in reset
raise e
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/autoscaler.py", line 479, in reset
self.config["cluster_name"])
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/providers.py", line 186, in _get_node_provider
provider_cls = _get_node_provider_cls(provider_config)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/providers.py", line 162, in _get_node_provider_cls
return importer(provider_config)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/providers.py", line 54, in _import_kubernetes
from ray.autoscaler._private.kubernetes.node_provider import \
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/kubernetes/__init__.py", line 1, in <module>
import kubernetes
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/kubernetes/__init__.py", line 2, in <module>
from kubernetes.config.config_exception import ConfigException
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/kubernetes/config.py", line 6, in <module>
from kubernetes import client
ImportError: cannot import name 'client' from 'kubernetes' (/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/kubernetes/__init__.py)
I’ve connected to the container, started python, and was able to both import kubernetes and run from kubernetes import client, so I’m out of ideas about what to try next.
The installed kubernetes version is
>>> kubernetes.__version__
'12.0.1'
If I drop back to rayproject/ray-ml:1.3.0, I don’t see this error, but I run into different RLlib errors.
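One detail in the last frame may be relevant: the path reported in the ImportError is Ray's own ray/autoscaler/_private/kubernetes/__init__.py, not the installed client library. That suggests that, at least inside the monitor process, import kubernetes might be resolving to Ray's internal package. Below is a minimal sketch for checking which file the name kubernetes resolves to and which sys.path entries could satisfy it. The helper names find_shadowing_candidates and resolved_origin are my own (hypothetical), not part of Ray or the kubernetes client, and the result only reflects the interpreter it is run in, which may differ from the monitor's environment.

```python
import importlib.util
import os
import sys


def find_shadowing_candidates(module_name, search_path=None):
    """List files on the search path that could satisfy `import module_name`.

    Hypothetical helper: it only looks for `<name>/__init__.py` and
    `<name>.py`, so namespace packages and zip imports are not covered.
    """
    search_path = sys.path if search_path is None else search_path
    hits = []
    for entry in search_path:
        base = entry or os.getcwd()  # an empty entry means the current directory
        candidates = (
            os.path.join(base, module_name, "__init__.py"),
            os.path.join(base, module_name + ".py"),
        )
        for candidate in candidates:
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


def resolved_origin(module_name):
    """Return the file Python would load for `module_name`, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # The first candidate in sys.path order is the one that wins the import.
    print("resolved:", resolved_origin("kubernetes"))
    for path in find_shadowing_candidates("kubernetes"):
        print("candidate:", path)
```

If the resolved path points into ray/autoscaler/_private/ rather than site-packages/kubernetes/, then a sys.path entry for that directory is probably shadowing the real client library. Running the same check from the process that starts the monitor should be more informative than an interactive python session, since the two may not share a working directory or sys.path.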