How to trigger the Ray garbage collector manually

I’m trying to measure the memory usage of some operations in my application, which uses Modin on top of a Ray cluster running on my MacBook. To measure memory usage, I call ray.available_resources()["object_store_memory"] before and after each operation. I don’t want garbage collection to run while an operation is executing, since that would skew the measurements. Is there a way to trigger the Ray garbage collector manually, so that I can run it before each operation?

You probably know this already, but Ray’s object store only contains data stored via ray.put(), task arguments, and task return values. So object_store_memory does not cover all of the application’s memory usage, e.g. Python dicts created in the application.

Ray uses distributed reference counting to manage objects in the object store, and the reference counting is tied to Python GC. So to answer your question, to delete objects in the Ray object store, you can just call del on the object reference from Python. An example:

>>> import ray
>>> a = ray.put("zzzz")
>>> ray.available_resources()["object_store_memory"]
37699821139.0
>>> b = ray.put("yyy")
>>> ray.available_resources()["object_store_memory"]
37699821121.0
>>> del b
>>> ray.available_resources()["object_store_memory"]
37699821139.0
>>> del a
>>> ray.available_resources()["object_store_memory"]
37699821158.0
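
If the goal is to keep Python's cyclic garbage collector from running in the middle of a measured operation, one option is to collect explicitly beforehand and pause automatic collection while the operation runs. The sketch below is only an illustration under that assumption: `measure_delta` and `probe` are hypothetical names, not Ray APIs, and the demo uses a fake counter in place of the object store. Note that `gc.disable()` pauses only the cyclic collector; plain reference counting still frees an object as soon as its last reference goes away.

```python
import gc
from typing import Any, Callable, Tuple


def measure_delta(operation: Callable[[], Any],
                  probe: Callable[[], float]) -> Tuple[Any, float]:
    """Run `operation` and return (result, probe_before - probe_after).

    `probe` is any zero-argument callable that reports free memory.
    Both names are hypothetical helpers for this sketch.
    """
    # Collect now so references kept alive only by reference cycles
    # are released before the baseline reading is taken.
    gc.collect()
    was_enabled = gc.isenabled()
    # Pause the cyclic collector so it cannot run mid-operation.
    gc.disable()
    try:
        before = probe()
        result = operation()
        after = probe()
    finally:
        # Restore whatever state the collector was in before.
        if was_enabled:
            gc.enable()
    return result, before - after


# Demo with a fake "object store" that starts with 1000 free bytes.
fake_store = {"free": 1000.0}


def fake_put():
    fake_store["free"] -= 18  # pretend the stored object costs 18 bytes
    return "yyy"


result, used = measure_delta(fake_put, lambda: fake_store["free"])
print(used)  # 18.0
```

With Ray, `probe` could be `lambda: ray.available_resources()["object_store_memory"]`, the same reading used in the example above. Keep the returned result referenced until after the second reading, otherwise the object may already have been released.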

Thank you for your help, Mingwei! I have a few more questions, and I’d appreciate it if you could help me with them.

  1. Does the garbage collection always happen immediately after calling del on the object?

  2. If an object holds references to “many” other objects and we delete that object, do all of the objects it references get deleted immediately as well (assuming nothing else references them)?

  3. Can you recommend any resources for learning more about distributed reference counting?

Glad to help!

  1. After calling del on the object, the local reference count of the underlying object is decremented synchronously. If the local reference count of the object becomes zero, and the object is owned by another node, this information is propagated asynchronously to the owner.

  2. Yes, reference counting is done for nested ObjectRefs as well. As above, local reference counting is done synchronously with del, and remote reference counting information is propagated asynchronously.

  3. The “Objects” (Ray 3.0.0.dev0) and “Memory Management” (Ray 1.12.1) documentation pages have high-level information on distributed reference counting. More details are available at https://stephanie-wang.github.io/pdfs/nsdi21-ownership.pdf
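
The synchronous, local half of points 1 and 2 can be illustrated with plain Python reference counting. This is only an analogy and assumes CPython: `Block` and `Dataset` are hypothetical stand-ins, not Ray classes, and the sketch says nothing about the asynchronous propagation to a remote owner.

```python
import weakref


class Block:
    """Hypothetical stand-in for a nested object reference (not a Ray class)."""

    def __init__(self, name):
        self.name = name


class Dataset:
    """Hypothetical container holding references to many blocks."""

    def __init__(self, n):
        self.blocks = [Block(f"block-{i}") for i in range(n)]


def count_alive(refs):
    # A weak reference returns None once its target has been freed.
    return sum(1 for r in refs if r() is not None)


def demo(n=1000):
    dataset = Dataset(n)
    # Weak references let us observe the blocks without keeping them alive.
    watchers = list(map(weakref.ref, dataset.blocks))
    alive_before = count_alive(watchers)
    # Dropping the only reference to the container releases every block
    # it held before `del` returns, because no cycles are involved.
    del dataset
    alive_after = count_alive(watchers)
    return alive_before, alive_after


print(demo())  # (1000, 0) on CPython
```

If the objects were part of a reference cycle, they would instead stay alive until the cyclic collector runs, for example through an explicit `gc.collect()`.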

Thank you very much again, Mingwei! Those resources are also very helpful.

Just to confirm: let’s say we have a single reference to a big dataset, and we del that reference. Does the garbage collection happen synchronously, or does it happen some time after the del statement returns?