AttributeError when trying to compute actions after training DreamerV3 on CartPole

  • High: It blocks me from completing my task.

So I trained CartPole-v1 with DreamerV3 using Tune in Colab on a free GPU:

from ray import tune
from ray.tune import Tuner
from ray.rllib.algorithms.dreamerv3 import DreamerV3Config
from ray.air import RunConfig, CheckpointConfig

# Define the configuration for DreamerV3 training.
config = (
    DreamerV3Config()
    .environment("CartPole-v1")
    .training(model_size="XS", training_ratio=1024)
    .resources(
        num_gpus=1,
        num_gpus_per_learner_worker=1,
        num_learner_workers=0,
    )
)

# Define the tuner
tuner = Tuner(
    "DreamerV3",
    run_config=RunConfig(
        stop={"training_iteration": 4000},
        checkpoint_config=CheckpointConfig(checkpoint_at_end=True)
    ),
    param_space=config,
)

# Run the tuner
result = tuner.fit()
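
As a side note, instead of hard-coding the results path below, the best checkpoint can be pulled straight from the Tuner output. A minimal sketch, assuming the standard episode_reward_mean metric reported by RLlib:

# Assumed sketch: locate the best trial's checkpoint in the ResultGrid.
best_result = result.get_best_result(metric="episode_reward_mean", mode="max")
print(best_result.checkpoint)  # Checkpoint pointing at the directory on disk.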

It trains in about an hour. Then I load the checkpoint:

from ray.rllib.algorithms.algorithm import Algorithm
algo = Algorithm.from_checkpoint("/root/ray_results/DreamerV3_2023-10-04_14-17-04/DreamerV3_CartPole-v1_b623a_00000_0_2023-10-04_14-17-12/checkpoint_000000")

Then I try to compute actions almost exactly as in the documentation example (Getting Started with RLlib — Ray 2.7.0):

# Note: `gymnasium` (not `gym`) will be **the** API supported by RLlib from Ray 2.3 on.
try:
    import gymnasium as gym

    gymnasium = True
except Exception:
    import gym

    gymnasium = False

from ray.rllib.algorithms.dreamerv3.dreamerv3 import DreamerV3Config

env_name = "CartPole-v1"
env = gym.make(env_name)
#algo = PPOConfig().environment(env_name).build()

episode_reward = 0
terminated = truncated = False

if gymnasium:
    obs, info = env.reset()
else:
    obs = env.reset()

while not terminated and not truncated:
    action = algo.compute_single_action(obs)
    if gymnasium:
        obs, reward, terminated, truncated, info = env.step(action)
    else:
        obs, reward, terminated, info = env.step(action)
    episode_reward += reward

But I get an error: AttributeError: 'DreamerV3EnvRunner' object has no attribute 'get_policy'
Is something still not ready with DreamerV3, or am I doing something wrong? Everything works fine with other algorithms.

I am having the same problem. See this PR for updated documentation about how to do it, and the issue where this is discussed.

Hello,
This is now documented at the end of the DreamerV3 README:

Running Action Inference after Training

To run action inference on a DreamerV3 Algorithm object, you can use this simple environment loop script.

Note the slight complexity caused by the fact that DreamerV3 a) uses a recurrent model, b) uses the new RLModule-based API stack (no Policy class), and c) outputs actions in a one-hot fashion for discrete action spaces.
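
For reference, here is a minimal sketch of such an environment loop. It assumes the TF implementation of DreamerV3 that ships with Ray 2.7, that the RLModule is reachable via algo.workers.local_worker().module, and that the batch/output keys are named "obs", "state_in", "state_out", "is_first", and "actions" (these may differ between Ray versions; the script linked from the README is authoritative):

import gymnasium as gym
import numpy as np
import tree  # pip install dm_tree

from ray.rllib.algorithms.algorithm import Algorithm
from ray.rllib.utils.framework import try_import_tf

tf1, tf, tfv = try_import_tf()

env = gym.make("CartPole-v1")
algo = Algorithm.from_checkpoint("<path-to-your-checkpoint>")

# b) No Policy class on the new API stack: the trained model is an RLModule
# held by the local EnvRunner (assumed attribute layout).
rl_module = algo.workers.local_worker().module

obs, info = env.reset()
# a) Recurrent model: fetch the initial state and carry it through the episode.
state = rl_module.get_initial_state()
# DreamerV3 must be told when a new episode starts (assumed "is_first" key).
is_first = 1.0
episode_return = 0.0

while True:
    # Add a batch dim (B=1) to all inputs.
    batch = {
        "obs": tf.convert_to_tensor(np.array([obs], np.float32)),
        "state_in": tree.map_structure(lambda s: tf.expand_dims(s, 0), state),
        "is_first": tf.convert_to_tensor([is_first]),
    }
    outs = rl_module.forward_inference(batch)
    # Carry the recurrent state over to the next timestep (drop the batch dim).
    state = tree.map_structure(lambda s: s[0], outs["state_out"])
    # c) Discrete actions come out one-hot: argmax recovers the integer action.
    action = int(np.argmax(outs["actions"][0]))
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
    is_first = 0.0
    if terminated or truncated:
        break

print(f"Episode return: {episode_return}")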