I am trying to pickle a python dictionary object containing ray dataset. The purpose of doing this is to checkpoint the dataset so that it can be used later for fault tolerance.
The above code works when run in local standalone mode.
But when I uncomment in line: ray.init("ray://localhost:10001") to connect to a ray cluster run using local k8s setup, it fails with the below mentioned error:
pickled_map = cloudpickle.dumps(map)
File "<python site packages dir>/ray/cloudpickle/cloudpickle_fast.py", line 73, in dumps
cp.dump(obj)
File "<python site packages dir>/ray/cloudpickle/cloudpickle_fast.py", line 620, in dump
return Pickler.dump(self, obj)
File "<python site packages dir>/ray/util/client/common.py", line 340, in __call__
raise TypeError("Actor methods cannot be called directly. Instead "
TypeError: Actor methods cannot be called directly. Instead of running 'object.__getstate__()', try 'object.__getstate__.remote()'.
Please let me know how can we get the pickling work here.
How severe does this issue affect your experience of using Ray?
Medium: It contributes to significant difficulty to complete my task, but I can work around it.
Hello @Clark_Zinzow , I tried with latest 2.0.0.dev0 (Linux Python 3.7 wheel) but still getting the same error. Again not in local standalone mode, but in k8s based cluster mode.
Hi @Preeti_Joshi, I was able to reproduce this on latest master when using Ray Client, doing a quick debugging push to see if I can determine the underlying issue.
It looks like this is an underlying issue with Ray Client, where direct serialization of an actor handle created under Ray Client fails due to a direct method call check on the actor handle. The below minimal reproduction also fails:
import ray
import ray.cloudpickle as cloudpickle
ray.init("ray://localhost:10001")
@ray.remote
class Foo:
def ping(self):
return "ok"
f = Foo.remote()
s = cloudpickle.dumps(f) # This raises the same error.
Interestingly, in-band serialization of the actor handle by passing it to another task does not fail, so Ray task arguments must go through a slightly different serialization path than direct out-of-band serialization with our cloudpickle:
@ray.remote
def bar(h):
return ray.get(h.ping.remote())
ray.get(bar.remote(f)) # This works just fine.
cc @ckw017 for Ray Client-specific things, do we expect direct, out-of-band serialization of actor handles to work under Ray Client?