Remote function parameter object handling

I am new to Ray and am seeing some behavior I didn’t expect.

I would like to understand how Ray handles object references passed to remote functions.

Say I have a class “E”, which I instantiate as ‘obj’.

I pass obj to a remote function ‘func_1’, which does something, but it is (unexpectedly) doing it on a copy of the original obj.

I expected each remote Ray function call to operate on the passed object reference directly, but it is creating a new object and operating on that instead.

Is there a way to get the remote function to act on the original object reference and not a copy?

import ray

@ray.remote
def func_1(obj):  # obj is an instance of E
    obj.doSomething()
    return obj

class E(object):
    def __init__(self):
        return

    def doSomething(self):
        return

def main():
    objs = []
    futures = []
    for _ in range(5):
        obj = E()
        objs.append(obj)
        print("Original: " + str(obj))
        # The task receives a deserialized copy of obj, not obj itself.
        futures.append(func_1.remote(obj))

    obj2 = ray.get(futures)
    for o in obj2:
        print("New: " + str(o))

if __name__ == "__main__":
    main()

==================================
Output:

Original: <__main__.E object at 0x7f26cc111ca0>
Original: <__main__.E object at 0x7f26bc13d730>
Original: <__main__.E object at 0x7f26cc0e2c70>
Original: <__main__.E object at 0x7f269c5a5520>
Original: <__main__.E object at 0x7f26bc100520>
New: <__main__.E object at 0x7f26bc100640>
New: <__main__.E object at 0x7f26bc13deb0>
New: <__main__.E object at 0x7f26bc100460>
New: <__main__.E object at 0x7f26bc100220>
New: <__main__.E object at 0x7f26bc1008b0>

After more reading, I realized I should be able to use ray.put / ray.get to do this. However, I get an error when I run the code below with func_1 as a remote function.

The same code runs fine without the remote decorator.

With the remote function I get this error:

o = ray.get(obj)

ValueError: 'object_refs' must either be an object ref or a list of object refs.

I also noticed that even though I pass the original class objects (stored in the dict), the values are again updated on a copy, not on the objects that were instantiated in main().

I must be misunderstanding something about how Ray is intended to work.

Is it possible for a remote function to directly update a class object that was instantiated in main()?

import ray

#@ray.remote     # works without remote
def func_1(obj, key):  # obj is the ObjectRef returned by ray.put(objs)

    o = ray.get(obj)
    o[key].doSomething(key)   # Why does this not update the original class obj created in main()?
    return o[key]

class E(object):
    def __init__(self):
        self.name = ""
        return

    def doSomething(self, name):
        self.name = name
        return

    def getName(self):
        return self.name

def main():
    objs = {}
    future = []

    for i in range(5):
        key = str(i)
        obj = E()
        objs[key] = obj

    o = ray.put(objs)

    for key in objs.keys():
        #future.append(func_1.remote(o, key))
        future.append(func_1(o, key))    # works without remote

    obj2 = ray.get(future)

    return

A couple of things:

  1. Objects referenced by an object reference are “immutable”. This is Ray’s computational model: you cannot modify an object created by ray.put or returned by another Ray task. If you’d like to mutate shared state, encapsulate the object in an actor. Example: https://docs.google.com/document/d/167rnnDFIVRhHhK4mznEIemOtj63IOhtIPvSYaPgI4Fg/edit#heading=h.eg7m6lz2y48u

  2. When you pass an object reference created by ray.put to other tasks, ray.get is implicitly called.

import numpy as np
import ray

a = ray.put(np.zeros(10000))

@ray.remote
def f(a):
    # In this case, a is not the object reference, but the original object.
    # Ray implicitly calls ray.get(a) before running the task.
    return a

f.remote(a)

Thanks for the info and pointer to Ray design patterns. This is helpful.

I will try to refactor my code to use this global variable actor pattern.

I still don't know why I am getting the error in my second example; this is really perplexing.

Thanks,
Ryan

To help you understand, add print(type(obj)) inside your remote function. It will show the original object (here, the dict), not an object reference. Simply put, when an object reference is passed as an argument to a task or actor, Ray implicitly calls ray.get on it. So in your example, ray.get(obj) has already been called behind the scenes, and the resulting object is what arrives as obj. Calling ray.get(obj) again inside the task therefore hands ray.get a dict rather than an object reference, which raises the ValueError.

Got it. Thanks so much for the explanation!