How can I configure ray to never run out of memory?

I’m using Ray as a backend with Modin’s out-of-core feature. Unfortunately, I still see a memory error saying that it cannot allocate memory in the object store. I realized that I’m using a lot of numpy arrays that take up memory, but for some reason they aren’t being spilled to disk.

So I think what might be happening is this: I have a mix of Ray code (Modin) and non-Ray code (numpy). The additional numpy code increases memory pressure, and at some point Ray runs out of memory and can no longer allocate new objects in the object store.

Is there a way to configure ray to always spill to external storage or disk so the program doesn’t run out of memory?

Digging into the documentation, I found that I could possibly spill to an S3 bucket, but I’d like to know whether this is possible and whether I’m on the right track.

import json

import ray

ray.init(
    _system_config={
        "max_io_workers": 4,  # More IO workers for remote storage.
        "min_spilling_size": 100 * 1024 * 1024,  # Spill at least 100MB at a time.
        "object_spilling_config": json.dumps(
            {"type": "smart_open", "params": {"uri": "s3:///bucket/path"}},
        ),
    },
)

https://docs.ray.io/en/master/memory-management.html#memory-aware-scheduling

Hi, disk spilling is recommended over S3 spilling right now (S3 spilling still needs more performance work). You can try disk spilling instead; see the Memory Management page in the Ray v2.0.0.dev0 docs.
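For reference, a minimal sketch of what a disk spilling setup could look like, assuming the "filesystem" spilling type described in the Ray memory management docs. The helper names and the "/tmp/spill" path are hypothetical, and depending on the Ray version additional `_system_config` flags may be needed to turn spilling on.

```python
import json


def filesystem_spilling_config(directory_path):
    """Build a _system_config dict asking Ray to spill objects to a local directory.

    Hypothetical helper; the "filesystem" type and "directory_path" key
    follow the object spilling section of the Ray docs.
    """
    return {
        "object_spilling_config": json.dumps(
            {"type": "filesystem", "params": {"directory_path": directory_path}}
        )
    }


def start_ray_with_disk_spilling(directory_path):
    """Start Ray with the spilling config above (hypothetical helper)."""
    import ray  # imported lazily so the config helper works without Ray installed

    return ray.init(_system_config=filesystem_spilling_config(directory_path))
```

Calling start_ray_with_disk_spilling("/tmp/spill") before importing Modin would start Ray with that config; the path is a placeholder and should point at a disk with enough free space.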

Also, are you in Ray’s public Slack channel?

No, I’m not on the public Slack.

I’m still getting ‘cant add object to object store’ errors even with disk spilling enabled. Is there anything else I can try?

Hey @devin-petersohn do you know what’s the best practice in Modin now (before we start testing Modin with object spilling)?

Also, @John_Smith would you like to join the public Slack and have a 1:1 meeting with me? I’d love to see what the issue is and help you get unblocked.

Also, see the Stack Overflow question "Ray object store running out of memory using out of core. How can I configure an external object store like s3 bucket?" for more detail about object spilling.

The current best practice for using Modin (installed from GitHub master) would be to initialize Ray with a large plasma store and have the plasma directory point to disk. That ensures the object store can be larger than memory, and the operating system then pages objects in as needed. It’s not as efficient as it could be, but the worst I’ve observed is 50-60% slower than pure in-memory performance (despite the 10x overhead of going to disk from memory). Usually this is worth it.
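A rough sketch of that setup, assuming the `object_store_memory` and `_plasma_directory` arguments of `ray.init` (the latter is a private, underscore-prefixed argument and may change between Ray versions). The helper names, the 64 GB size, and the "/mnt/scratch/plasma" path are hypothetical placeholders, not values from this thread.

```python
def plasma_on_disk_kwargs(object_store_bytes, plasma_directory):
    """Build ray.init keyword arguments for a disk-backed plasma store.

    Hypothetical helper: object_store_bytes may exceed physical RAM
    because the store's backing files live under plasma_directory on disk.
    """
    return {
        "object_store_memory": object_store_bytes,
        "_plasma_directory": plasma_directory,
    }


def init_ray_with_disk_backed_store(object_store_bytes, plasma_directory):
    """Start Ray with the arguments above (hypothetical helper)."""
    import ray  # imported lazily so the kwargs helper works without Ray installed

    return ray.init(**plasma_on_disk_kwargs(object_store_bytes, plasma_directory))


# Example values only: a 64 GB store backed by a placeholder scratch directory.
EXAMPLE_KWARGS = plasma_on_disk_kwargs(64 * 1024**3, "/mnt/scratch/plasma")
```

The idea would be to call init_ray_with_disk_backed_store(...) first and import modin.pandas afterwards, so that Modin picks up the already-initialized Ray instance rather than starting its own.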
