I’m trying to measure the memory usage of some operations in my application, which uses Modin on a Ray cluster running on my MacBook. To measure the memory usage, I call ray.available_resources()["object_store_memory"] before and after executing each operation. I don’t want the garbage collector to run in the middle of an operation, since that would skew the measurements. Is there a way to trigger the Ray garbage collector manually, so that I can do it before executing each operation?
You probably know this already, but Ray’s object store only contains data stored via ray.put(), task arguments, and return values. So object_store_memory does not cover all of the application’s memory usage, e.g. Python dicts created in the application.
Ray uses distributed reference counting to manage objects in the object store, and the reference counting is tied to Python GC. So, to answer your question: to delete an object from the Ray object store, you can just call del on the object reference from Python. An example:
>>> import ray
>>> a = ray.put("zzzz")
>>> ray.available_resources()["object_store_memory"]
37699821139.0
>>> b = ray.put("yyy")
>>> ray.available_resources()["object_store_memory"]
37699821121.0
>>> del b
>>> ray.available_resources()["object_store_memory"]
37699821139.0
>>> del a
>>> ray.available_resources()["object_store_memory"]
37699821158.0
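The before/after measurement described in the question could be wrapped in a small helper along these lines. This is only a sketch: measure_object_store_usage, object_store_free_bytes, probe, and settle_seconds are hypothetical names, not part of Ray. It assumes that calling Python's gc.collect() first is enough to drop any reference cycles still holding object references, and that the reported resources may lag slightly behind the real state, hence the optional pause.

```python
import gc
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def object_store_free_bytes() -> float:
    """Return the free object store memory that Ray currently reports."""
    # Imported lazily so the helper below can be exercised without a cluster.
    import ray

    return ray.available_resources().get("object_store_memory", 0.0)


def measure_object_store_usage(
    operation: Callable[[], T],
    probe: Callable[[], float] = object_store_free_bytes,
    settle_seconds: float = 0.0,
) -> Tuple[T, float]:
    """Run `operation` and return (result, bytes of object store it consumed)."""
    # Collect Python garbage up front so unreachable cycles that still hold
    # object references are released before the first reading.
    gc.collect()
    time.sleep(settle_seconds)  # assumption: reported values may lag a little
    before = probe()

    # Keep the result alive so the objects it references are not freed
    # before the second reading is taken.
    result = operation()

    time.sleep(settle_seconds)
    after = probe()
    return result, before - after
```

With a running cluster, a call such as measure_object_store_usage(lambda: ray.put("yyy"), settle_seconds=1.0) would return the object reference together with the drop in free object store memory. Passing a custom probe makes it possible to try the helper without Ray.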
Thank you for your help, Mingwei! I have a few more questions, and I’d appreciate it if you could help me with them.
- Does the garbage collection always happen immediately after calling del on the object?
- If an object holds references to “many” other objects and we delete it, do all the objects it references get deleted immediately as well (assuming nothing else references them)?
- Can you recommend any resources for learning more about distributed reference counting?
Glad to help!
- After calling del on the object, the local reference count of the underlying object is decremented synchronously. If the local reference count of the object becomes zero and the object is owned by another node, this information is propagated asynchronously to the owner.
- Yes, reference counting is done for nested ObjectRefs as well. As above, local reference counting is done synchronously with del, and remote reference counting information is propagated asynchronously.
- The Ray documentation pages “Objects” (Ray 3.0.0.dev0) and “Memory Management” (Ray 1.12.1) have high-level information on distributed reference counting. More details are available at https://stephanie-wang.github.io/pdfs/nsdi21-ownership.pdf
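The local, synchronous part of the answers above can be illustrated with plain CPython reference counting. The sketch below does not use Ray at all: Payload is a hypothetical stand-in for an object reference, and the behavior shown is CPython's, so it says nothing about how quickly a remote owner is notified. It also shows one case where del alone is not enough locally, namely a reference cycle, which is only reclaimed once the cycle collector runs.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for an object reference; not a Ray class."""


def run_demo() -> dict:
    outcome = {}

    # 1. A single reference: del drops the count to zero right away.
    big = Payload()
    watcher = weakref.ref(big)
    del big
    outcome["freed_by_del"] = watcher() is None

    # 2. Nested references: deleting the container releases its items too,
    #    as long as nothing else refers to them.
    container = [Payload(), Payload()]
    item_watchers = list(map(weakref.ref, container))
    del container
    outcome["nested_freed_by_del"] = all(w() is None for w in item_watchers)

    # 3. A reference cycle: del is not enough, the cycle collector must run.
    gc.disable()  # keep the automatic collector from interfering with the demo
    try:
        first = Payload()
        second = Payload()
        first.other = second
        second.other = first
        cycle_watcher = weakref.ref(first)
        del first, second
        outcome["cycle_alive_after_del"] = cycle_watcher() is not None
        gc.collect()
        outcome["cycle_freed_by_collect"] = cycle_watcher() is None
    finally:
        gc.enable()

    return outcome


outcome = run_demo()
print(outcome)
```

If object references end up inside such cycles, calling gc.collect() before taking a measurement is one way to make sure they have been released locally.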
Thank you very much again, Mingwei! Those resources are also very helpful.
Just to confirm: let’s say we hold a single reference to a big dataset and we del it. Does the garbage collection happen synchronously, or does it happen some time after the del statement returns?