Sample Ray program does not work on Kubernetes with the ray-1.4.0 branch

Running the sample gives me the error below:
$ python ray/doc/kubernetes/example_scripts/run_local_example.py
/home/asmalvan/anaconda3/envs/ray/lib/python3.7/site-packages/ray/autoscaler/_private/cli_logger.py:61: FutureWarning: Not all Ray CLI dependencies were found. In Ray 1.4+, the Ray CLI, autoscaler, and dashboard will only be usable via pip install 'ray[default]'. Please update your install command.
"update your install command.", FutureWarning)
Iteration 0
Traceback (most recent call last):
File "ray/doc/kubernetes/example_scripts/run_local_example.py", line 58, in
main()
File "ray/doc/kubernetes/example_scripts/run_local_example.py", line 49, in main
print(Counter(ray.get(results)))
File "/home/asmalvan/anaconda3/envs/ray/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 61, in wrapper
return getattr(ray, func.__name__)(*args, **kwargs)
File "/home/asmalvan/anaconda3/envs/ray/lib/python3.7/site-packages/ray/util/client/api.py", line 42, in get
return self.worker.get(vals, timeout=timeout)
File "/home/asmalvan/anaconda3/envs/ray/lib/python3.7/site-packages/ray/util/client/worker.py", line 202, in get
res = self._get(obj_ref, op_timeout)
File "/home/asmalvan/anaconda3/envs/ray/lib/python3.7/site-packages/ray/util/client/worker.py", line 225, in _get
raise err
types.RayTaskError(RayOutOfMemoryError): ray::gethostname() (pid=412, ip=172.17.0.4)
File "python/ray/_raylet.pyx", line 460, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 481, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 351, in ray._raylet.raise_if_dependency_failed
ray.exceptions.RayTaskError: ray::gethostname() (pid=412, ip=172.17.0.4)
File "python/ray/_raylet.pyx", line 458, in ray._raylet.execute_task
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/memory_monitor.py", line 141, in raise_if_low_memory
self.error_threshold))
ray._private.memory_monitor.RayOutOfMemoryError: More than 95% of the memory on node example-cluster-ray-head-type-kjr92 is used (0.49 / 0.5 GB). The top 10 memory consumers are:

PID MEM COMMAND
133 0.08GiB /home/ray/anaconda3/bin/python -u /home/ray/anaconda3/lib/python3.7/site-packages/ray/new_dashboard/
354 0.08GiB /home/ray/anaconda3/bin/python -m ray.util.client.server --redis-address=172.17.0.4:6379 --port=2300
185 0.07GiB /home/ray/anaconda3/bin/python -u /home/ray/anaconda3/lib/python3.7/site-packages/ray/new_dashboard/
117 0.05GiB /home/ray/anaconda3/bin/python -m ray.util.client.server --redis-address=172.17.0.4:6379 --port=1000
412 0.05GiB ray::IDLE
151 0.04GiB /home/ray/anaconda3/bin/python -u /home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/log_m
116 0.01GiB /home/ray/anaconda3/lib/python3.7/site-packages/ray/core/src/ray/gcs/gcs_server --redis_address=172.
150 0.01GiB /home/ray/anaconda3/lib/python3.7/site-packages/ray/core/src/ray/raylet/raylet --raylet_socket_name=
106 0.01GiB /home/ray/anaconda3/lib/python3.7/site-packages/ray/core/src/ray/thirdparty/redis/src/redis-server *
111 0.01GiB /home/ray/anaconda3/lib/python3.7/site-packages/ray/core/src/ray/thirdparty/redis/src/redis-server *

Should the resources in the example-cluster.yaml file be updated?

I increased the worker and head node memory to 2G in /home/asm/code/ray140/ray/deploy/charts/ray/values.yaml, and the example works now.
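For reference, the change amounts to bumping the memory values in the chart's pod type configuration. A minimal sketch, assuming the podTypes/rayHeadType/rayWorkerType layout used by the Ray Helm chart around the 1.4 release (check your own values.yaml, as field names and defaults may differ between chart versions):

```yaml
# deploy/charts/ray/values.yaml (excerpt; field names assumed from the
# Ray Helm chart around 1.4 -- verify against your chart's values.yaml)
podTypes:
  rayHeadType:
    CPU: 1
    memory: 2Gi   # raised from the default; the head pod OOMed at 0.5 GB
  rayWorkerType:
    minWorkers: 2
    maxWorkers: 3
    CPU: 1
    memory: 2Gi   # raised from the default
```

After editing, redeploy (e.g. `helm upgrade <release> ./ray -n <namespace>`, or delete and reinstall the chart) so the head and worker pods pick up the new memory limits.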
