Here’s a code example that shows how to manage ObjectRefs with a long-lived actor (PlasmaStoreActor) to avoid ReferenceCountingAssertionError when using max_calls or other short-lived workers. The key is to make the actor the owner of the object (via the _owner argument to ray.put) and to store the ObjectRef in the actor before the worker exits, so that the object’s lifetime is tied to the actor, not the ephemeral worker (GitHub issue 18456, Ray Discourse):
import ray
import numpy as np

# Long-lived actor that owns and stores ObjectRefs
@ray.remote
class PlasmaStoreActor:
    def __init__(self):
        self.refs = {}

    def store(self, key, obj_refs):
        # The ref arrives wrapped in a list: Ray resolves top-level ObjectRef
        # arguments to their values, but passes nested refs through as refs.
        self.refs[key] = obj_refs[0]

    def retrieve(self, key):
        return self.refs.get(key)

# Start Ray and the actor
ray.init()
store_actor = PlasmaStoreActor.options(name="PlasmaStoreActor", lifetime="detached", get_if_exists=True).remote()

# Remote function with max_calls that puts the object and stores its ref in the actor
@ray.remote(max_calls=2)
def generate_and_store_array(key, size):
    arr = np.random.random(size)
    # Put the array in the object store, owned by the actor (_owner is experimental)
    obj_ref = ray.put(arr, _owner=store_actor)
    # Store the ObjectRef in the actor before the worker exits
    ray.get(store_actor.store.remote(key, [obj_ref]))
    return key

# Usage
key = "my_array"
size = (100, 100)
ray.get(generate_and_store_array.remote(key, size))

# Retrieve the ObjectRef from the actor and get the data
obj_ref = ray.get(store_actor.retrieve.remote(key))
arr = ray.get(obj_ref)
print(arr.shape)  # (100, 100)
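This pattern keeps the ObjectRef reachable after the worker exits, which avoids the ReferenceCountingAssertionError. Note that _owner is an experimental argument to ray.put, and the object stays available only as long as the owning actor is alive and holds the ref.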
Sources:
- https://discuss.ray.io/t/ray-serve-sharing-objects-with-deployment/5243
- https://github.com/ray-project/ray/issues/53727