I am getting the following warning when the head node is started:
2025-01-22 21:20:54,695 WARNING services.py:2022 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 2318802944 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=2.47gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
Note however:
I have left docker.disable_shm_size_detection unspecified, i.e. it is using the default value of False.
Ray automatically chooses and passes --shm-size to Docker. The command logged in monitor.out is: docker run --rm --name ray_container -d -it -e LC_ALL=C.UTF-8 -e LANG=C.UTF-8 --ulimit nofile=65536:65536 --shm-size='2324098867.2000003b' --net=host 906373526063.dkr.ecr.us-east-1.amazonaws.com/gitlab-ci/hawkeye:hawkeye-jakob.leben bash
The 2324098867 bytes in the docker command is more than the 2318802944 bytes in the WARNING. Meanwhile, the Ray Dashboard's Resource Status reports 2.18 GiB of object store memory, which is 2340757176 B, more than both of the previous values.
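To make the comparison concrete (my own arithmetic, using 1 GiB = 1024³ B): the WARNING's 2318802944 B ≈ 2.16 GiB, the requested --shm-size of 2324098867.2 B ≈ 2.16 GiB, and the dashboard's 2.18 GiB ≈ 2.18 × 1024³ ≈ 2340757176 B.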
What is going on? Seems like an inconsistency across multiple places in Ray.
I know I can manually override --shm-size in the docker run options, but I'd rather not do that, as I'd need to explicitly specify a different appropriate size for each of the many EC2 instance types that I allow the autoscaler to use.
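For reference, this is roughly what that manual override would look like in the cluster config's docker section (a sketch only: the image and container name are from my setup above, the 4gb value is just an illustrative placeholder that would have to be sized per instance type):

```yaml
docker:
  image: 906373526063.dkr.ecr.us-east-1.amazonaws.com/gitlab-ci/hawkeye:hawkeye-jakob.leben
  container_name: ray_container
  run_options:
    # placeholder value only; it would have to be chosen per instance type
    # (more than 30% of that instance's RAM, per the warning above)
    - --shm-size=4gb
  # presumably this would also need to be set so Ray doesn't pass its own
  # auto-detected --shm-size on top of the one above (I haven't verified this)
  disable_shm_size_detection: true
```

Maintaining a value like that for every instance type in available_node_types is exactly the manual bookkeeping I'm hoping to avoid.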