Writing directly into Ray shared memory

Medium: Significantly affects my productivity but can find a workaround.

I am guessing this isn’t an option, but I’ll try anyway.

I am creating a large buffer as the output of my Ray task. When I return it, Ray writes it to the plasma shared memory, which means another memory write of the entire buffer.

If I could receive a pointer to the shared memory directly, I could write to it and avoid the extra copy.

Is this at all possible?

Currently, Ray does not support direct write access to the plasma shared-memory buffer from within a task. When you return a large object (such as a numpy array), Ray copies the data into the object store, which costs one extra write of the whole buffer. Zero-copy is supported only for reading numpy arrays from the object store (that is, on ray.get), not for writing, so you cannot obtain a pointer to the shared memory and write into it to avoid the copy at task return time.

This limitation is acknowledged and tracked as a potential enhancement in the Ray project, but as of now user code cannot perform zero-copy or in-place writes to plasma buffers.

For true in-place writes to shared memory, you would need a shared-memory construct outside Ray’s object store, such as Python’s multiprocessing.shared_memory. That is not integrated with Ray’s object lifecycle or distributed features. See the linked issue and forum post for discussion and workarounds: Ray GitHub Issue #48961; Ray Discourse: How to share memory with non-numpy object.
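As a rough illustration of the multiprocessing.shared_memory route mentioned above, here is a minimal sketch using only the standard library. The function names (create_output_buffer, fill_buffer, run_demo) are hypothetical, and the sketch assumes everything runs on a single node. In a Ray program, fill_buffer would presumably be the body of a @ray.remote task that receives the block's name and returns only a small value, so the large buffer never passes through the object store. Ray does not manage the lifetime of this block, so the creator has to unlink it.

```python
from multiprocessing import shared_memory


def create_output_buffer(nbytes: int) -> shared_memory.SharedMemory:
    """Driver side: allocate a named shared-memory block up front."""
    return shared_memory.SharedMemory(create=True, size=nbytes)


def fill_buffer(shm_name: str, nbytes: int) -> int:
    """Worker side: attach to the block by name and write the result in place.

    Assumption: in a Ray program this would be a @ray.remote task running on
    the same node as the process that created the block.
    """
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        # shm.buf is a memoryview of unsigned bytes; writes land directly in
        # the shared block. With numpy, the same buffer could be wrapped as
        # np.ndarray(shape, dtype=..., buffer=shm.buf) instead.
        for i in range(nbytes):
            shm.buf[i] = i % 256
    finally:
        shm.close()  # detach only; the creator is responsible for unlink()
    return nbytes  # small return value, cheap to pass back


def run_demo(nbytes: int = 1024) -> bytes:
    """Create the block, fill it via fill_buffer, and read the data back."""
    shm = create_output_buffer(nbytes)
    try:
        fill_buffer(shm.name, nbytes)
        # The block may be rounded up to a page size, so slice to nbytes.
        return bytes(shm.buf[:nbytes])
    finally:
        shm.close()
        shm.unlink()  # free the block; Ray will not do this for you
```

Calling run_demo() exercises the whole round trip in one process. The trade-off is that you take over cleanup and node placement yourself, which is exactly what the object store normally handles.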

Would you like more detail or example code for possible workarounds?

