How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I’ve been testing the DreamerV3 algorithm on CartPole. There is something I do not understand. What is the difference between using:
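from ray.rllib.algorithms.dreamerv3.dreamerv3 import DreamerV3Config
from ray.tune.logger import pretty_print  # needed for pretty_print() below

config = (
    DreamerV3Config()
    .environment("CartPole-v1")
    .training(
        model_size="XS",
        training_ratio=1024,
    )
)
algo = config.build()
for _ in range(100000):
    # Each call runs one training iteration and returns a result dict.
    result = algo.train()
    print(pretty_print(result))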
and using
from ray import tune
from ray.tune import Tuner
from ray.rllib.algorithms.dreamerv3 import DreamerV3Config
from ray.air import RunConfig, CheckpointConfig

# `config` is the same DreamerV3Config object built in the first snippet.
tuner = Tuner(
    "DreamerV3",
    run_config=RunConfig(
        stop={"training_iteration": 100000},
        checkpoint_config=CheckpointConfig(checkpoint_at_end=True),
        verbose=3,
    ),
    param_space=config,
)
result = tuner.fit()
I tested both on CartPole. With the algo.train() loop, training gets stuck at around 2000 iterations. The Tuner version keeps training, but the result is still not great after 100000 iterations. What is the correct way to train and evaluate this algorithm in Ray 2.9?
These are my library versions:
Running on Ubuntu 22.04, Python 3.9
ray 2.9.2
gym 0.26.2
gymnasium 0.28.1
tensorboard 2.15.2
tensorboardX 2.6.2.2
tensorflow 2.15.0.post1
tensorflow-estimator 2.15.0
tensorflow-probability 0.23.0