I tried to run a program that needs a large Ray object store on a cluster machine with about 260GB of free memory, but Ray allocated only 90GB to the object store. When I passed the "object_store_memory" parameter to ray.init(), it raised an error saying that this parameter must not be provided when connecting to an existing cluster. Why is that? How can I use as much of the 260GB as possible for storing objects?
WARNING:tensorflow:From /root/anaconda3/lib/python3.7/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2020-11-26 08:31:17,726 INFO worker.py:651 -- Connecting to existing Ray cluster at address: 192.168.137.151:6379
Traceback (most recent call last):
  File "./buffer_test.py", line 105, in
  File "/root/anaconda3/lib/python3.7/site-packages/ray/worker.py", line 728, in init
    raise ValueError("When connecting to an existing cluster, "
ValueError: When connecting to an existing cluster, object_store_memory must not be provided.
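
For reference, here is a minimal sketch of the constraint the error describes. It is not the actual buffer_test.py: the helper name `ray_init_kwargs` and the 200 GiB figure are hypothetical, chosen only for illustration. The sketch assumes that the object store of an existing cluster was sized when its nodes were started, so the argument is only meaningful when ray.init() starts a fresh local instance.

```python
def ray_init_kwargs(address=None, object_store_memory=None):
    """Build keyword arguments for ray.init() (hypothetical helper).

    Drops object_store_memory whenever an address is given, because
    ray.init() raises ValueError if both are passed together.
    """
    kwargs = {}
    if address is not None:
        # Connecting to an existing cluster: leave object_store_memory out.
        kwargs["address"] = address
    elif object_store_memory is not None:
        # Starting a new local instance: the size (in bytes) may be requested.
        kwargs["object_store_memory"] = object_store_memory
    return kwargs


# Hypothetical request for 200 GiB, expressed in bytes.
requested = 200 * 1024**3

# Connecting to the running cluster from the log above: the size is dropped.
cluster_kwargs = ray_init_kwargs(address="192.168.137.151:6379",
                                 object_store_memory=requested)

# Starting a fresh local instance: the size is kept.
local_kwargs = ray_init_kwargs(object_store_memory=requested)

# With Ray installed, the call would then be: ray.init(**cluster_kwargs)
```

With this guard the script connects without the ValueError, but the object store keeps whatever size the cluster nodes were started with, so the 90GB limit would still have to be changed where the cluster itself is launched.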