Error creating RLPredictor using restored checkpoint

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi experts.

I get the following error when restoring a checkpoint for a DQN model using Ray 2.1.0:

    Traceback (most recent call last):
      File "/home/stefan/PycharmProjects/RLProjects/rl_offline_trainer/", line 61, in <module>
        predictor = RLPredictor.from_checkpoint(checkpoint)
      File "/home/stefan/anaconda3/envs/py38_ray2.1/lib/python3.8/site-packages/ray/train/rl/", line 63, in from_checkpoint
        policy = checkpoint.get_policy(env)
      File "/home/stefan/anaconda3/envs/py38_ray2.1/lib/python3.8/site-packages/ray/train/rl/", line 42, in get_policy
        return Policy.from_checkpoint(checkpoint=self)["default_policy"]
      File "/home/stefan/anaconda3/envs/py38_ray2.1/lib/python3.8/site-packages/ray/rllib/policy/", line 256, in from_checkpoint
        policy_state = pickle.load(f)
      File "/home/stefan/anaconda3/envs/py38_ray2.1/lib/python3.8/site-packages/ray/_private/", line 89, in _actor_handle_deserializer
        return, outer_id)
      File "/home/stefan/anaconda3/envs/py38_ray2.1/lib/python3.8/site-packages/ray/", line 1281, in _deserialization_helper
        return worker.core_worker.deserialize_and_register_actor_handle(
      File "python/ray/_raylet.pyx", line 2137, in ray._raylet.CoreWorker.deserialize_and_register_actor_handle
      File "python/ray/_raylet.pyx", line 2106, in ray._raylet.CoreWorker.make_actor_handle
      File "/home/stefan/anaconda3/envs/py38_ray2.1/lib/python3.8/site-packages/ray/_private/", line 522, in load_actor_class
        actor_class = self._load_actor_class_from_gcs(
      File "/home/stefan/anaconda3/envs/py38_ray2.1/lib/python3.8/site-packages/ray/_private/", line 617, in _load_actor_class_from_gcs
        class_name = ensure_str(class_name)
      File "/home/stefan/anaconda3/envs/py38_ray2.1/lib/python3.8/site-packages/ray/_private/", line 289, in ensure_str
        assert isinstance(s, bytes)
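The last frame is Ray's internal `ensure_str` helper failing its `bytes` check while the actor class is being loaded from the GCS. A simplified stand-in for that helper (an assumption based on the traceback, not Ray's exact source) shows how a missing class name, e.g. `None` from a lookup that finds nothing in a fresh Ray session, would trip exactly this bare assert:

```python
def ensure_str(s, encoding="utf-8", errors="strict"):
    # Simplified stand-in for the ensure_str helper in the traceback
    # (an assumption; the real implementation may differ in details).
    if isinstance(s, str):
        return s
    assert isinstance(s, bytes)  # fails with a bare AssertionError for None
    return s.decode(encoding, errors)

# bytes and str pass through cleanly...
print(ensure_str(b"default_policy"))

# ...but a missing (None) class name reproduces the error shape above:
try:
    ensure_str(None)
except AssertionError:
    print("AssertionError, as in the traceback")
```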

I use Ray Tune to run several trials, with the option to save a checkpoint at the end, along these lines:

    # create tuner (the concrete values were elided in this post;
    # `trainer` is a placeholder for the DQN trainer used)
    tuner = Tuner(

        # trainer
        trainer,

        # create tune configuration
        tune_config=TuneConfig(...),

        # hyper-parameters
        param_space={...},

        # specify run configuration
        run_config=RunConfig(...),
    )

    # run trials
    result_grid =

I then recreate the best checkpoint and use it to create an RLPredictor, at which point the above error occurs:

    # recreate checkpoint
    checkpoint = Checkpoint.from_directory(path=checkpoint_path)

    # create RLPredictor from checkpoint - error occurs when this executes
    predictor = RLPredictor.from_checkpoint(checkpoint)

From what I can tell, the checkpoint folder contains all the necessary artifacts. What am I doing wrong?
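One stdlib-only way to double-check which artifacts the checkpoint directory actually contains before handing it to `from_checkpoint` (the helper name here is just an illustration, not a Ray API):

```python
import os

def list_checkpoint_artifacts(checkpoint_path):
    """Recursively list all files under a checkpoint directory,
    relative to its root (illustrative helper, not a Ray API)."""
    artifacts = []
    for root, _dirs, files in os.walk(checkpoint_path):
        for name in files:
            full = os.path.join(root, name)
            artifacts.append(os.path.relpath(full, checkpoint_path))
    return sorted(artifacts)
```

Comparing its output against a checkpoint that restores correctly can confirm whether anything was lost when the directory was copied between sessions.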


Hi @steff,

The code looks good. Could you turn this into a GH issue with a complete repro script?


Sure. Is there a web page that describes the steps?


Hi @steff ,

Nothing special, I can write the steps down:

  • Go to the official Ray repo
  • Click "Issues", then create a new issue
  • Fill out the form; include the repro script and ideally a description of what you expected to happen vs. what is happening
  • Post the link here for reference

Hey @steff, I have the same problem on 2.3.1. Did you end up creating an issue for this or finding a resolution?

Created an issue for this. Here is the link: Cannot create RLPredictor using restored checkpoint in different Ray session · Issue #33995 · ray-project/ray · GitHub