I am new to Ray and am seeing some behavior I didn’t expect.
I would like to understand how Ray handles passing object references to remote functions.
Say I have a class “E”, which I instantiate as ‘obj’.
I pass obj to a remote function ‘func_1’, which does something, but it is (unexpectedly) doing it on a copy of the original obj.
I expected each remote Ray function call to operate on the passed object reference directly, but it is creating a new object and operating on that instead.
Is there a way to get the remote function to act on the original object reference and not a copy?
import ray

@ray.remote
def func_1(obj):  # obj is a class obj
    ref = obj.doSomething()
    return obj

class E(object):
    def __init__(self):
        return

    def doSomething(self):
        return

def main():
    objs = []
    future = []
    for _ in range(5):
        obj = E()
        objs.append(obj)
        print("Original: " + str(obj))
        future.append(func_1.remote(obj))
    obj2 = ray.get(future)
    for o in obj2:
        print("New: " + str(o))
    return

if __name__ == "__main__":
    main()
Output:
Original: <__main__.E object at 0x7f26cc111ca0>
Original: <__main__.E object at 0x7f26bc13d730>
Original: <__main__.E object at 0x7f26cc0e2c70>
Original: <__main__.E object at 0x7f269c5a5520>
Original: <__main__.E object at 0x7f26bc100520>
New: <__main__.E object at 0x7f26bc100640>
New: <__main__.E object at 0x7f26bc13deb0>
New: <__main__.E object at 0x7f26bc100460>
New: <__main__.E object at 0x7f26bc100220>
New: <__main__.E object at 0x7f26bc1008b0>
After more reading, I realized I should be able to use ray.put / ray.get to do this. However, I get an error when I run the code below with func_1 as a remote function.
The same code runs fine without the remote decorator.
With the remote function I get this error:
o = ray.get(obj)
ValueError: 'object_refs' must either be an object ref or a list of object refs.
I also noticed that even though I pass the original class object (stored in the dict), the values are updated on a copy and not on the original object that was instantiated in main().
I must be misunderstanding something about how Ray is intended to work.
Is it possible for a remote function to directly update a class object that was instantiated in main()?
import ray

#@ray.remote  # works without remote
def func_1(obj, key):  # obj is a class obj
    o = ray.get(obj)
    o[key].doSomething(key)  # Why does this not update the original class obj created in main()?
    return o[key]

class E(object):
    def __init__(self):
        self.name = ""
        return

    def doSomething(self, name):
        self.name = name
        return

    def getName(self):
        return self.name

def main():
    objs = {}
    future = []
    for i in range(5):
        key = str(i)
        obj = E()
        objs[key] = obj
    o = ray.put(objs)
    for key in objs.keys():
        #future.append(func_1.remote(o, key))
        future.append(func_1(o, key))  # works without remote
    obj2 = ray.get(future)
    return
When you pass an object reference created by ray.put to other tasks, ray.get is implicitly called.
import numpy as np
import ray

a = ray.put(np.zeros(10000))

@ray.remote
def f(a):
    # In this case, a is not the object reference, but the value it refers to.
    # Ray implicitly calls ray.get(a).
    return a

f.remote(a)
To help you understand, you can add print(type(obj)) to your remote function. It will show the type of the underlying object (a dict in your second example), not an object reference. Simply put, if an object reference is passed as an argument to another task or actor, ray.get is implicitly called on it. So in your example, ray.get(obj) has already been called behind the scenes, and its result is passed as the argument obj instead of the object reference itself. Calling ray.get(obj) again inside func_1 therefore fails with the ValueError above.
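As for why the original object in main() is never updated: a remote function runs in a separate worker process, so its arguments and return value are serialized and rebuilt on the other side. The sketch below reproduces that effect without Ray. It is only an illustration under stated assumptions: copy.deepcopy stands in for Ray's serialize/deserialize round trip, and fake_remote_call is a hypothetical helper, not a Ray API.

```python
import copy


class E:
    def __init__(self):
        self.name = ""

    def doSomething(self, name):
        self.name = name


def fake_remote_call(obj, name):
    """Hypothetical stand-in for a Ray task (not a Ray API)."""
    # Assumption: deepcopy mimics the argument being serialized by the
    # driver and rebuilt inside the worker process.
    worker_copy = copy.deepcopy(obj)
    # The mutation happens on the worker's copy only.
    worker_copy.doSomething(name)
    # The return value is copied back to the caller in the same way.
    return copy.deepcopy(worker_copy)


original = E()
result = fake_remote_call(original, "0")

print(result is original)   # False
print(repr(original.name))  # ''
print(result.name)          # 0
```

So the updated state only reaches the driver through the value you get back from ray.get, which is what obj2 holds in your code. If you need one object that many callers mutate in place, that is what Ray actors are meant for, rather than plain remote functions.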