After a few minutes, the program aborted with the following error:
2021-03-16 13:55:05,471 ERROR worker.py:936 -- print_logs: Connection closed by server.
2021-03-16 13:55:05,472 ERROR import_thread.py:88 -- ImportThread: Connection closed by server.
Aborted
I then logged back in to the head node and ran ray status, which reported another error:
ray status
======== Autoscaler status: 2021-03-16 14:16:35.705243 ========
Node status
---------------------------------------------------------------
Healthy:
 1 head-node
 2 worker-node
Pending:
 (no pending nodes)
Recent failures:
 (no failures)
Resources
---------------------------------------------------------------
Usage:
 0.0/3.0 CPU
 0.0/2.0 bar
 0.0/2.0 foo
 0.00/1.904 GiB memory
 0.00/0.857 GiB object_store_memory
Demands:
 (no resource demands)
The autoscaler failed with the following error:
Terminated with signal 15
File "/home/ray/anaconda3/bin/ray-operator", line 8, in <module>
sys.exit(main())
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/ray_operator/operator.py", line 154, in main
handle_event(event_type, cluster_cr, cluster_name)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/ray_operator/operator.py", line 113, in handle_event
cluster_action(event_type, cluster_cr, cluster_name)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/ray_operator/operator.py", line 127, in cluster_action
ray_clusters[cluster_name].create_or_update()
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/ray_operator/operator.py", line 50, in create_or_update
self.do_in_subprocess(self._create_or_update)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/ray_operator/operator.py", line 40, in do_in_subprocess
self.subprocess.start()
File "/home/ray/anaconda3/lib/python3.7/multiprocessing/process.py", line 112, in start
self._popen = self._Popen(self)
File "/home/ray/anaconda3/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/home/ray/anaconda3/lib/python3.7/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/home/ray/anaconda3/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "/home/ray/anaconda3/lib/python3.7/multiprocessing/popen_fork.py", line 74, in _launch
code = process_obj._bootstrap()
File "/home/ray/anaconda3/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/ray/anaconda3/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/ray_operator/operator.py", line 54, in _create_or_update
self.start_monitor()
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/ray_operator/operator.py", line 79, in start_monitor
self.mtr.run()
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/monitor.py", line 274, in run
self._run()
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/monitor.py", line 177, in _run
self.autoscaler.update()
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/autoscaler.py", line 135, in update
self._update()
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/autoscaler.py", line 186, in _update
if (self._keep_min_worker_of_node_type(node_id, node_type_counts)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/autoscaler.py", line 414, in _keep_min_worker_of_node_type
tags = self.provider.node_tags(node_id)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/kubernetes/node_provider.py", line 64, in node_tags
pod = core_api().read_namespaced_pod(node_id, self.namespace)
File "/home/ray/anaconda3/lib/python3.7/site-packages/kubernetes/client/api/core_v1_api.py", line 22785, in read_namespaced_pod
return self.read_namespaced_pod_with_http_info(name, namespace, **kwargs) # noqa: E501
File "/home/ray/anaconda3/lib/python3.7/site-packages/kubernetes/client/api/core_v1_api.py", line 22894, in read_namespaced_pod_with_http_info
collection_formats=collection_formats)
File "/home/ray/anaconda3/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 353, in call_api
_preload_content, _request_timeout, _host)
File "/home/ray/anaconda3/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 184, in __call_api
_request_timeout=_request_timeout)
File "/home/ray/anaconda3/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 377, in request
headers=headers)
File "/home/ray/anaconda3/lib/python3.7/site-packages/kubernetes/client/rest.py", line 243, in GET
query_params=query_params)
File "/home/ray/anaconda3/lib/python3.7/site-packages/kubernetes/client/rest.py", line 216, in request
headers=headers)
File "/home/ray/anaconda3/lib/python3.7/site-packages/urllib3/request.py", line 76, in request
method, url, fields=fields, headers=headers, **urlopen_kw
File "/home/ray/anaconda3/lib/python3.7/site-packages/urllib3/request.py", line 97, in request_encode_url
return self.urlopen(method, url, **extra_kw)
File "/home/ray/anaconda3/lib/python3.7/site-packages/urllib3/poolmanager.py", line 336, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "/home/ray/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
chunked=chunked,
File "/home/ray/anaconda3/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
httplib_response = conn.getresponse()
File "/home/ray/anaconda3/lib/python3.7/http/client.py", line 1344, in getresponse
response.begin()
File "/home/ray/anaconda3/lib/python3.7/http/client.py", line 306, in begin
version, status, reason = self._read_status()
File "/home/ray/anaconda3/lib/python3.7/http/client.py", line 267, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/home/ray/anaconda3/lib/python3.7/socket.py", line 589, in readinto
return self._sock.recv_into(b)
File "/home/ray/anaconda3/lib/python3.7/ssl.py", line 1071, in recv_into
return self.read(nbytes, buffer)
File "/home/ray/anaconda3/lib/python3.7/ssl.py", line 929, in read
return self._sslobj.read(len, buffer)