Memory not released to default levels: `ray::IDLE` Processes Not Released

The most common cause of apparent Ray object store (MMAP_SHM) memory “leakage” is that ObjectRefs to large objects (such as numpy arrays or big lists) remain in scope somewhere in your application, which prevents Ray from evicting or freeing them. This happens when references are held in Python variables, returned in dictionaries, or passed between Serve deployments and never deleted. In your pattern, returning ObjectRefs in a dict and then awaiting them in another module can easily leave lingering references if any of them is not dropped promptly after use (Ray memory management docs, Ray memory debugging guide).
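As a minimal sketch of the "await, then drop" idea: the helper below (the name `await_and_release` is hypothetical, not a Ray API) removes each entry from the dict before awaiting it, so neither the dict nor a leftover local variable keeps a reference alive afterwards. It uses plain coroutines as stand-ins, on the assumption that the values in your dict are awaitables such as Ray ObjectRefs.

```python
import asyncio
from typing import Any, Awaitable, Dict


async def await_and_release(refs: Dict[str, Awaitable[Any]]) -> Dict[str, Any]:
    """Await every entry in `refs`, removing it from the dict as it is consumed.

    Hypothetical helper: `refs` stands in for a dict of awaitable references.
    """
    results: Dict[str, Any] = {}
    while refs:
        key, ref = refs.popitem()   # the dict no longer holds this reference
        results[key] = await ref    # resolve it to its value
        del ref                     # drop the last local reference
    return results


async def _fake_result(value: int) -> int:
    # Stand-in for remote work; a real application would hold ObjectRefs instead.
    await asyncio.sleep(0)
    return value


pending = {"a": _fake_result(1), "b": _fake_result(2)}
resolved = asyncio.run(await_and_release(pending))
```

After the call, `pending` is empty, so the caller no longer holds any of the references. The resolved values themselves can still keep object store memory in use while they are alive (for example, numpy arrays returned by Ray), so release those too once you are done with them.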

Additionally, if you pass large objects by value (rather than by reference) or repeatedly create new ObjectRefs without cleaning up old ones, the object store can fill up and may not release memory even after tasks complete. This is especially true in long-running Serve deployments, where idle workers or lingering references can pin objects in memory. Run `ray memory` to check for in-scope ObjectRefs, and delete references as soon as the objects are no longer needed (Ray Discourse: object store memory issues, Ray memory debugging guide).

Would you like a step-by-step guide to track down which references are causing the leak?
