PPO from checkpoint

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hello, I am currently training multi-agent PPO on a custom environment, like this:

    from ray import tune
    from ray.air.integrations.wandb import WandbLoggerCallback
    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        .api_stack(
            # New API stack: RLModule + Learner enabled, env runners/connectors v2 left off.
            enable_rl_module_and_learner=True,
            # enable_env_runner_and_connector_v2=True,
        )
        .environment(
            "RayRWARE",
            env_config={"agents_num": AGENTS_NUM, "max_steps": 1000, "render_mode": "none"},
        )
        .framework("torch")
        .rollouts(num_rollout_workers=4)
        .resources(num_gpus=1)
        .evaluation(
            evaluation_interval=50,
            evaluation_duration=1,
            evaluation_config={"env_config": {"agents_num": AGENTS_NUM, "max_steps": 1000, "render_mode": "none"}},
        )
    )

    results = tune.run(
        "PPO",
        name="PPO_RWARE",
        num_samples=1,
        keep_checkpoints_num=2,
        checkpoint_score_attr="env_runners/episode_reward_mean",
        checkpoint_at_end=True,
        # scheduler=pbt,
        config=config.to_dict(),
        stop={"training_iteration": 1000},
        checkpoint_freq=50,
        callbacks=[
            WandbLoggerCallback(
                project="rayrware_test",
            )
        ] if USE_WANDB else None,
    )
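For context, RayRWARE is a custom multi-agent environment registered under the string name used in the config. The registration looks roughly like the sketch below (simplified, not my exact code; the constructor arguments are illustrative):

    from ray.tune.registry import register_env

    # Register the custom multi-agent env under the name used in .environment("RayRWARE").
    # The lambda/constructor call here is a simplification of my actual setup.
    register_env("RayRWARE", lambda env_config: RayRWARE(**env_config))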

Tune creates a checkpoint at the end of training, but I am unable to load and use it later. I am trying to do it like this:

    from ray.train import Checkpoint
    from ray.rllib.algorithms import ppo

    checkpoint = Checkpoint.from_directory(path_to_checkpoint)
    print("Checkpoint loaded")
    model = ppo.PPO.from_checkpoint(checkpoint, env=RayRWARE)

but it freezes when calling ppo.PPO.from_checkpoint(…).
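For reference, path_to_checkpoint points at the checkpoint directory Tune produces; a sketch of how it can be pulled out of the analysis object returned by tune.run (illustrative, not my exact code):

    # Illustrative sketch: pull the best checkpoint out of the ExperimentAnalysis
    # returned by tune.run and materialize it as a local directory.
    best_trial = results.get_best_trial(
        metric="env_runners/episode_reward_mean", mode="max"
    )
    best_checkpoint = results.get_best_checkpoint(
        best_trial, metric="env_runners/episode_reward_mean", mode="max"
    )
    path_to_checkpoint = best_checkpoint.to_directory()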
How should I be handling this? Thanks in advance!