Dear community,
I am wondering about surprisingly low GPU usage on my Windows machine when training DQN with RLlib. I care about both directions: on the one hand, training should run fast; on the other hand, RAM consumption must be monitored closely.
I am running a config similar to the one below - just changing to CartPole instead of my custom environment.
Two strange observations:
- While the runtime increases from iteration to iteration (e.g., iteration #34 took 23 min while iteration #8 took 5 min), the CPU utilization drops to 34-40% on average in the later iterations.
- The GPU utilization stays very low (1%) and only shows occasional peaks.
My first idea to explain the sporadic peaks is that they correspond to the policy update at the end of each rollout batch. Is this correct?
Code:
from ray.rllib.algorithms.dqn import DQNConfig
from tqdm import tqdm

config = (
    DQNConfig()
    .environment(CustomEnv)  # my custom env class; for the repro I swap in "CartPole-v1" here
    .rollouts(num_rollout_workers=3, num_envs_per_worker=1, batch_mode="complete_episodes")
    .framework("torch")
    .experimental(_enable_new_api_stack=False)
    .evaluation(evaluation_num_workers=1, evaluation_interval=1000)
    .resources(num_gpus=1, num_cpus_per_worker=3, num_gpus_per_worker=0.2)
    .debugging(log_level="ERROR")
    .reporting(
        min_sample_timesteps_per_iteration=500
    )  # Ensures that "progress.csv" does not list the timesteps separately for each iteration.
    .training(
        hiddens=[],
        dueling=False,
        train_batch_size=train_batch_size,  # defined earlier in my script
        training_intensity=False,
    )
)

algo = config.build()

iteration_num = 50
for iteration in tqdm(range(iteration_num)):
    result = algo.train()
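For reference, here is a minimal sketch of how I plan to narrow down where the extra time per iteration goes, by printing the per-iteration timers inside the training loop (assuming result["timers"] exposes keys such as sample_time_ms and learn_time_ms; the exact keys may differ between RLlib versions):

    # Inside the training loop, right after result = algo.train():
    # print how much of the iteration was spent sampling vs. learning.
    timers = result.get("timers", {})
    print(
        f"iter {iteration}: "
        f"sample_time_ms={timers.get('sample_time_ms')}, "
        f"learn_time_ms={timers.get('learn_time_ms')}"
    )

If sample_time_ms dominates and learn_time_ms is small, that would match my guess that the GPU only peaks briefly during the policy update.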