How can I change the directory that the raylet uses for object spilling? By default, it stores spilled objects in “/tmp/ray/…”.
I am using this code snippet that I found in Ray:
import json
import ray

if not ray.is_initialized():
    ray.init(
        configure_logging=True,
        ignore_reinit_error=True,
        _system_config={
            "object_spilling_config": json.dumps(
                {"type": "filesystem", "params": {"directory_path": "raylet_results"}},
            )
        },
    )
I want to store it in the “raylet_results” folder, but this code is not working: it still stores its results in “/tmp/ray/…”. How can I stop it from storing in “/tmp/ray…”?
Hmm, I tried out your code locally and it seems to work for me. It could be that you are already connected to a Ray instance that has the default spilling directory set, so the new _system_config that you’re passing does not apply. Here are two things to check:
1. Did you run ray start earlier on this node? If so, then ray.init() will connect to this Ray instance, which is already pointing to the default spill directory. To set the directory at ray start time instead, you can try a command like one of these (either should work):
   RAY_object_spilling_config="{\"type\": \"filesystem\", \"params\": {\"directory_path\": \"raylet_results\"}}" ray start --head
   ray start --head --system-config='{"object_spilling_config": "{\"type\": \"filesystem\", \"params\": {\"directory_path\": \"raylet_results\"}}"}'
2. Try removing ignore_reinit_error? My guess here is that this is hiding the error you would normally get when passing _system_config to an already-started Ray instance.
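To make the two checks above concrete, here is a minimal sketch, not a definitive recipe. The helper names (spilling_config_json, build_system_config, system_config_cli_arg, init_ray_with_spill_dir) are hypothetical and not part of Ray. It builds the same object_spilling_config from the thread with an absolute path, produces the doubly encoded --system-config argument without hand-written escaping, and starts Ray without ignore_reinit_error so that any error surfaces. It assumes that no cluster started via ray start is running on the node; if one is, I would expect ray.init() to reject the _system_config rather than apply it.

```python
import json
import os
import shlex


def spilling_config_json(directory_path):
    """Return the inner object_spilling_config value as a JSON string."""
    return json.dumps(
        {
            "type": "filesystem",
            # Absolute path, so the result does not depend on the working directory.
            "params": {"directory_path": os.path.abspath(directory_path)},
        }
    )


def build_system_config(directory_path):
    """Return a dict intended for ray.init(_system_config=...)."""
    return {"object_spilling_config": spilling_config_json(directory_path)}


def system_config_cli_arg(directory_path):
    """Return a shell-quoted --system-config argument for `ray start --head`."""
    outer = json.dumps(build_system_config(directory_path))
    return "--system-config=" + shlex.quote(outer)


def init_ray_with_spill_dir(directory_path):
    """Drop this process's Ray connection, then start Ray with the spill directory."""
    import ray  # imported lazily so the helpers above work without Ray installed

    if ray.is_initialized():
        ray.shutdown()
    os.makedirs(os.path.abspath(directory_path), exist_ok=True)
    # No ignore_reinit_error here, so a configuration problem is not hidden.
    return ray.init(_system_config=build_system_config(directory_path))
```

Calling init_ray_with_spill_dir("raylet_results") is the intended usage. Note that ray.shutdown() only disconnects the current script; as far as I know it does not stop a cluster that was launched with ray start, which would need ray stop.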
@Stephanie_Wang I am running the code on a remote EC2 instance. I have removed ignore_reinit_error and I call ray.shutdown() before ray.init(), but it is still storing in /tmp/ray/.... I even set the full path.
One weird thing is happening. The raylet log says:
2022-12-02 06:37:18,832 INFO worker.py:1519 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
Ray
Python version: 3.10.6
Ray version: 2.1.0
Dashboard: http://127.0.0.1:8265
(raylet) [2022-12-02 06:37:27,825 E 129150 129197] (raylet) file_system_monitor.cc:105: /home/pritham1/dnn_automation/raylet_results is over 95% full, available space: 261623808; capacity: 8132173824. Object creation will fail if spilling is required.
It says /home/pritham1/dnn_automation/raylet_results is over 95% full, even though it is storing in /tmp/ray.
Secondly, is there any way to do what the ray start command does, but from a Python script? Or does Ray start automatically when calling ray.init()? I don’t want to use the CLI.