Ray.cloudpickle error while pickling ray.dataset

I am trying to pickle a python dictionary object containing ray dataset. The purpose of doing this is to checkpoint the dataset so that it can be used later for fault tolerance.

Here the code which I am using to do so:

import ray
import ray.cloudpickle as cloudpickle
# ray.init("ray://localhost:10001")
ray_dataset = ray.data.range(10)
map = {'key1': 'value1', 'ray_dataset' : ray_dataset}
pickled_map = cloudpickle.dumps(map)
print("pickled_map:" , pickled_map)

map1 = ray.cloudpickle.loads(pickled_map)
print("map1:" , map1)

The above code works when run in local standalone mode.
But when I uncomment in line: ray.init("ray://localhost:10001") to connect to a ray cluster run using local k8s setup, it fails with the below mentioned error:

    pickled_map = cloudpickle.dumps(map)
  File "<python site packages dir>/ray/cloudpickle/cloudpickle_fast.py", line 73, in dumps
    cp.dump(obj)
  File "<python site packages dir>/ray/cloudpickle/cloudpickle_fast.py", line 620, in dump
    return Pickler.dump(self, obj)
  File "<python site packages dir>/ray/util/client/common.py", line 340, in __call__
    raise TypeError("Actor methods cannot be called directly. Instead "
TypeError: Actor methods cannot be called directly. Instead of running 'object.__getstate__()', try 'object.__getstate__.remote()'.

Please let me know how can we get the pickling work here.

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

@Clark_Zinzow could you answer this question?

Hi @Preeti_Joshi, this works for me on latest master, what version of Ray are you using?

I am using ray 1.11.0. Let me try the latest dev 2.0 build though.

Hello @Clark_Zinzow , I tried with latest 2.0.0.dev0 (Linux Python 3.7 wheel) but still getting the same error. Again not in local standalone mode, but in k8s based cluster mode.

Thanks for checking @Preeti_Joshi and apologies for the late response, I momentarily lost track of this thread, but I’ll dig into this on Monday!

Hello @Clark_Zinzow, did you get a chance to look into this issue?

Hello @Clark_Zinzow, following up on this again. Did you get a chance to check the pickling error mentioned in this thread?

Hi @Preeti_Joshi, I was able to reproduce this on latest master when using Ray Client, doing a quick debugging push to see if I can determine the underlying issue.

It looks like this is an underlying issue with Ray Client, where direct serialization of an actor handle created under Ray Client fails due to a direct method call check on the actor handle. The below minimal reproduction also fails:

import ray
import ray.cloudpickle as cloudpickle
ray.init("ray://localhost:10001")

@ray.remote
class Foo:
    def ping(self):
        return "ok"

f = Foo.remote()
s = cloudpickle.dumps(f)  # This raises the same error.

Interestingly, in-band serialization of the actor handle by passing it to another task does not fail, so Ray task arguments must go through a slightly different serialization path than direct out-of-band serialization with our cloudpickle:

@ray.remote
def bar(h):
    return ray.get(h.ping.remote())

ray.get(bar.remote(f))  # This works just fine.

cc @ckw017 for Ray Client-specific things, do we expect direct, out-of-band serialization of actor handles to work under Ray Client?