Getting reference counting assertion error when storing ObjectRefs in class variables

How severe does this issue affect your experience of using Ray?

  • High: It blocks me to complete my task.

The following code results in a ReferenceCountingAssertionError:

import ray
import time

# dummy actor to own objects after test_put process is killed
@ray.remote
class global_actor():
    def wake(self):
        pass

# wake the actor
actor = global_actor.remote()

@ray.remote
class test_put():
    def __init__(self):
        self.putted = ray.put(123, _owner=actor)
    def get(self):
        return self.putted
    def print(self):
        pass

test = test_put.remote()
t_get = ray.get(test.get.remote())
# allow the actor to exit and terminate
del test

Interestingly, this does not happen if I construct the ObjectRef inside test_put.get() instead of in the constructor. However, for my purposes the actor needs to keep a local reference to the ObjectRef, and that reference must be available right after construction. Any clues as to what is happening?

Thanks in advance.


Upon further experimentation, the above issue has to do with the assignment of an ObjectRef to a class variable not incrementing the internal reference counter. It seems to work when assigning into a dict or list, but not an object. From the official documentation I thought that I could nest references in objects, but it seems to only work with lists or dicts. Is this the intended behaviour? Thanks.

Hey @bsun, welcome to the Ray community, and thanks for posting your question with a reproducible script.

As for your use case, if you want to keep the object after your original test_put actor is deleted, how about having your global_actor store the object references explicitly? I am not 100% sure about the reference counting protocol here, but looking at the _owner doc, it seems to me that setting it at ray.put is not enough to make this indirect ownership work.

I think this works, but I'm not sure whether you have any other requirements:

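import ray
import time

# dummy actor to own objects after test_put process is killed
@ray.remote
class global_actor():
    def wake(self):
        pass

    def store_obj(self, obj):
        self.obj = obj

    def get_obj(self):
        return self.obj

# wake the actor
actor = global_actor.remote()

@ray.remote
class test_put:
    def __init__(self, owner):
        # assumed: the owner is set here and then handed a reference explicitly,
        # as the _owner doc describes
        self.putted = ray.put(123, _owner=owner)
        # wrap the ref in a list so it arrives as an ObjectRef, not the resolved value
        ray.get(owner.store_obj.remote([self.putted]))

    def get(self):
        return self.putted

    def print(self):
        pass

test = test_put.remote(owner=actor)
t_get = ray.get(test.get.remote())

# allow the actor to exit and terminate
del test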



As for

the above issue has to do with the assignment of an ObjectRef to a class variable not incrementing the internal reference counter

cc @Stephanie_Wang, who should know more about the reference counting protocol than I do.

+1 to @rickyyx’s answer. Hmm not sure what you mean by

assignment of an ObjectRef to a class variable

can you explain this part a bit more?

By the way, I would encourage you to avoid using ray.put(_owner) since this API is experimental and operations like ref counting may not work as expected.

Thanks for the workaround! I seem to have missed the tidbit about needing to pass the owner a reference to the object, and your solution does indeed fix that. I have grossly oversimplified my use case here so I’m not sure it will be directly applicable, but this is a great start for now.

Thanks for the info! I’ll see if there’s an alternative workaround that does not require us to use the _owner kwarg.

As for the reference counting error, my test_put class breaks with the following code (with everything else held the same):

@ray.remote
class test_put():
    def __init__(self):
        pass
    def get(self):
        self.putted = ray.put(123, _owner=actor)
        return self.putted

But even with the _owner assignment, the following alternative keeps the reference alive in the driver code:

@ray.remote
class test_put():
    def __init__(self):
        pass
    def get(self):
        # store in a local variable, instead of as a class/instance variable in self.__dict__
        temp = ray.put(123, _owner=actor)
        temp2 = [ray.put(456, _owner=actor)]
        temp3 = {'test': ray.put(789, _owner=actor)}
        self.temp = temp
        self.temp2 = temp2
        self.temp3 = temp3
        return temp # returning either temp2 or temp3 also keeps the reference alive
        # return self.temp
        # returning self.temp/temp2/temp3 will not keep the reference alive and throws the assertion error

In either case, I do not pass the ObjectRef to the owner. However, specifically when I assign the ObjectRef to a class/instance variable and then return that specific self.* variable, I get the reference count error.

With respect to @rickyyx’s solution, I’ll look for an alternative along those lines and will try to refrain from using _owner for now. Thanks for everyone’s help!

Hmm I see, thanks! I’m not able to reproduce the same behavior, though, and I don’t think it should matter how you pass back the variable. I’d guess that the root cause of the error is a race condition, and maybe the race happened to correlate with what you were seeing.