Why is done almost always False?

deepgravity · October 3, 2021, 7:30pm

Hi all,
I am training a Rainbow agent on CartPole-v0 with ray.tune.
While the agent learns to keep the pole stable, the done curve in TensorBoard is always False. Please see this image. Does anyone know what the problem is?

I expect done=True when an episode terminates, but that does not seem to be the case.

Thanks!

mannyv · October 3, 2021, 8:03pm

@deepgravity,

That done tells you whether the Tune trial (the RL training run) is done, not the state of episodes in the environment.

deepgravity · October 3, 2021, 8:58pm

Hi @mannyv,
Thanks for your response.
So in my example, done is always 0. What does that mean?
And could you please clarify what you mean by the RL training?

mannyv · October 3, 2021, 9:15pm

When you call tune.run you are asking Tune to run some trials of something. In this case you are asking Tune to use RLlib to train on an RL problem. It looks like you only have 1 trial.

You have only asked Tune to train one thing. You could have run multiple trials by asking it, for example, to train with two different learning rates to compare performance, or you could have set num_samples>1. In the latter case it would run several identical configurations so you could look at variability.
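To make the trial count concrete, here is a minimal sketch in plain Python. It does not call Ray; `expand_trials` is a hypothetical helper that only mimics how a grid of values and num_samples multiply into trials (every grid combination, repeated num_samples times). The learning-rate values are made up for illustration.

```python
from itertools import product


def expand_trials(grid, num_samples=1):
    """Return one config dict per trial.

    Hypothetical helper, not part of Ray: every combination of the grid
    values becomes a config, and the whole set is repeated num_samples times.
    """
    keys = list(grid)
    combos = [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]
    return [dict(combo) for _ in range(num_samples) for combo in combos]


# One learning rate, one sample: a single trial (the situation in this thread).
one = expand_trials({"lr": [1e-3]})

# Two learning rates to compare: two trials.
two_lrs = expand_trials({"lr": [1e-3, 1e-4]})

# One configuration repeated three times to look at variability: three trials.
repeated = expand_trials({"lr": [1e-3]}, num_samples=3)

print(len(one), len(two_lrs), len(repeated))
```

Each of those trials would report its own done value, so with a single trial there is exactly one done curve to look at.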

In those cases you have multiple trials, some of which might be running at the same time and others which might be queued until earlier ones finish. For example, say you ask for 10 trials, you only have 8 CPUs, and you use 1 CPU per trial. You would have to wait for 2 trials to finish before trials 9 and 10 start.
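The queueing in that example can be sketched with a small simulation. This is plain Python, not Tune's scheduler: `simulate_start_times` is a hypothetical function that assumes trials are started in order and that a queued trial starts as soon as any running trial frees its CPU. The durations are invented numbers.

```python
import heapq


def simulate_start_times(durations, total_cpus=8, cpus_per_trial=1):
    """Return the start time of each trial under a simple first-come queue.

    Hypothetical model, not Tune's real scheduler: at most
    total_cpus // cpus_per_trial trials run at once, and a queued trial
    starts when the earliest-finishing running trial completes.
    """
    slots = total_cpus // cpus_per_trial
    finish_times = []  # min-heap of finish times of running trials
    starts = []
    for duration in durations:
        if len(finish_times) < slots:
            start = 0.0  # a CPU is free right away
        else:
            start = heapq.heappop(finish_times)  # wait for the next trial to finish
        starts.append(start)
        heapq.heappush(finish_times, start + duration)
    return starts


# 10 trials with made-up durations, 8 CPUs, 1 CPU per trial.
durations = [5, 7, 6, 9, 8, 10, 12, 11, 4, 4]
starts = simulate_start_times(durations)
for trial, start in enumerate(starts, 1):
    print(f"trial {trial} starts at t={start}")
```

In this run the first eight trials start immediately, while trials 9 and 10 start only after the two shortest running trials have finished.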

The tune value you are looking at is telling you if the trial is still training or if it has finished.
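Here is a rough sketch of why that curve is almost always False. It is plain Python rather than Ray code, and `run_trial` is a hypothetical stand-in for a trial. It assumes the trial reports one result per training iteration and that only the last reported result is flagged done=True; episodes ending inside an iteration never touch that flag.

```python
def run_trial(max_iterations):
    """Yield one result dict per training iteration.

    Hypothetical stand-in for a trial: many episodes may start and end
    inside each iteration, but "done" here describes the trial itself and
    is only True on the final result (an assumption of this sketch).
    """
    for i in range(1, max_iterations + 1):
        yield {"training_iteration": i, "done": i == max_iterations}


results = list(run_trial(max_iterations=100))
done_values = [r["done"] for r in results]
fraction_true = sum(done_values) / len(done_values)

print(f"done=True in {sum(done_values)} of {len(done_values)} results")
```

To see how episodes are doing, look at the episode-level values in the results (for example the mean episode reward or episode length) rather than done.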

deepgravity · October 5, 2021, 3:22pm

Many thanks @mannyv for the detailed explanation.

“It looks like you only have 1 trial.”
Yes, I only have one trial now.

So, the number of trials in Ray corresponds to the number of configs. If I use Tune for hyperparameter tuning of the learning rate, say with only two values lr1 and lr2, then I have 2 trials?

mannyv · October 5, 2021, 4:09pm

Yes, that is correct!
