How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
Hello, I am currently training multi-agent PPO on a custom environment, like this:
from ray import tune
from ray.air.integrations.wandb import WandbLoggerCallback
from ray.rllib.algorithms.ppo import PPOConfig

# AGENTS_NUM and USE_WANDB are defined earlier in my script.
config = (
    PPOConfig()
    .api_stack(
        enable_rl_module_and_learner=True,
        # enable_env_runner_and_connector_v2=True,
    )
    .environment(
        "RayRWARE",
        env_config={"agents_num": AGENTS_NUM, "max_steps": 1000, "render_mode": "none"},
    )
    .framework("torch")
    .rollouts(num_rollout_workers=4)
    .resources(num_gpus=1)
    .evaluation(
        evaluation_interval=50,
        evaluation_duration=1,
        evaluation_config={"env_config": {"agents_num": AGENTS_NUM, "max_steps": 1000, "render_mode": "none"}},
    )
)
results = tune.run(
    "PPO",
    name="PPO_RWARE",
    num_samples=1,
    keep_checkpoints_num=2,
    checkpoint_score_attr="env_runners/episode_reward_mean",
    checkpoint_at_end=True,
    # scheduler=pbt,
    config=config.to_dict(),
    stop={"training_iteration": 1000},
    checkpoint_freq=50,
    callbacks=[
        WandbLoggerCallback(
            project="rayrware_test",
        )
    ] if USE_WANDB else None,
)
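For completeness, the custom environment is registered with Tune before this runs, roughly like this (RWAREEnv below is a placeholder for my actual env class, so treat it as a sketch of my setup rather than the exact code):

from ray.tune.registry import register_env

# "RayRWARE" is the name the config above refers to; RWAREEnv stands in
# for my actual multi-agent env class.
def env_creator(env_config):
    return RWAREEnv(
        agents_num=env_config["agents_num"],
        max_steps=env_config["max_steps"],
        render_mode=env_config["render_mode"],
    )

register_env("RayRWARE", env_creator)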
Tune creates a checkpoint at the end of training, but I am unable to load and use it later. I am trying to do it like this:
from ray.train import Checkpoint
from ray.rllib.algorithms import ppo

# path_to_checkpoint points at the checkpoint directory Tune wrote.
checkpoint = Checkpoint.from_directory(path_to_checkpoint)
print("Checkpoint loaded")
model = ppo.PPO.from_checkpoint(checkpoint, env=RayRWARE)
but it freezes on the call to ppo.PPO.from_checkpoint(…).
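For reference, this is the plain-path variant I plan to try next (a sketch; it assumes the same env registration as above has already run in this process):

from ray.rllib.algorithms.algorithm import Algorithm

# from_checkpoint also accepts a plain directory path; the env must be
# registered under "RayRWARE" in this process too, since the restored
# config refers to it by name.
algo = Algorithm.from_checkpoint(path_to_checkpoint)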
How should I be handling this? Thanks in advance!