Docker. Using /tmp instead of /dev/shm because /dev/shm has only 31457280000 bytes available

Hi again. So yes, only ~31 GB is available (as set by --shm-size=Xgb). Any ideas? Running on GCP.

WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 31457280000 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=Xgb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 2gb.
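For reference, the byte count in the warning converts as follows (plain unit arithmetic, nothing Ray-specific):

```python
shm_bytes = 31457280000  # the size reported in the warning

print(shm_bytes / 1024**2)  # 30000.0 MiB exactly
print(shm_bytes / 1024**3)  # ~29.3 GiB
print(shm_bytes / 10**9)    # ~31.46 GB (decimal), the "31 GB" in the message
```

So the /dev/shm Docker actually provided is about 29.3 GiB, even though it reads as "31 GB" in decimal units.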

I think this might be because Ray is mis-detecting the available memory and trying to set the object store size to more than 31 GiB. Can you try setting the object store size explicitly to less than that? e.g.,

ray start --object-store-memory=10000000000 or ray.init(object_store_memory=10000000000)

Hi, thanks for helping out.
This setup still results in that error:

ray.init(num_cpus=self.tuner_params['cpus'],
         num_gpus=self.tuner_params['gpus'],
         _memory=38 * 1024 * 10e6,
         object_store_memory=7 * 1024 * 10e6)
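One thing worth double-checking in the snippet above (plain Python arithmetic, not Ray-specific): `10e6` is 1e7, i.e. ten million, so these settings request far more memory than it may look like:

```python
# 10e6 == 10 * 10**6 == 1e7, not 10**6
object_store_bytes = 7 * 1024 * 10e6   # 71,680,000,000 bytes, ~71.7 GB
memory_bytes = 38 * 1024 * 10e6        # 389,120,000,000 bytes, ~389.1 GB

print(object_store_bytes, memory_bytes)
```

An object store of ~71.7 GB exceeds the 60G `--shm-size`, which would trigger exactly the warning quoted above.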

and Docker run

docker run --shm-size=60G cogad3

Docker file:

FROM tensorflow/tensorflow:nightly

RUN apt-get update && apt-get install --no-install-recommends --no-install-suggests -y curl
RUN apt-get install unzip
RUN apt-get -y install rsync grsync
RUN apt-get -y install python3
RUN apt-get -y install python3-pip
RUN pip install --upgrade pip

COPY . .
RUN pip install -r requirements.txt

EXPOSE 5000

ENTRYPOINT ["python3", "cogad3.py"]

cc @ijrsvt. Maybe the same issue as the one we found before; can you check?

I am using GCP as well, and I get the same error message. Is it possible to configure Ray to use a different directory entirely for shared memory? E.g. I want to use the folder $HOME/NFS/dev/shared.

I am working on a system where the machine nodes have limited amounts of memory (~10 GB max) and the users have very small personal hard disks (~5 GB). I would therefore like Ray to use the mounted NFS drive for its shared memory. Is this possible?

Hi @cama,

Sorry for the late reply. You can use the --plasma-directory option for that purpose. (e.g. ray start --plasma-directory=NFS/dev/shared)
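Building on that, the plasma directory and object store size flags can be combined in one `ray start` invocation; a sketch (the path and size below are illustrative placeholders, not values from this thread):

```shell
# Put the plasma object store on the NFS mount and cap its size.
# Adjust the path and byte count to your cluster.
ray start --head \
  --plasma-directory=/path/to/NFS/dev/shared \
  --object-store-memory=4000000000
```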

Hello jjyao!

Thank you for the answer! However, I am currently using the Pool API. Is there a way to pass the plasma directory or the object_store_memory to the Pool call?

from ray.util.multiprocessing import Pool

with Pool(4) as p:
    # Do something
    pass