Hi, everybody!
I have been trying Population-Based Training (PBT) with the TD3 algorithm, and I have a minor question about matching hyperparameters: how do I make the 'actor hidden nodes' and 'critic hidden nodes' take the same value when I use PBT in Ray Tune?
For example, if I set the config as below:
config['critic_hiddens'] = [tune.choice([[64, 64], [64], [32], [32, 32], [16, 16]])]
config['actor_hiddens'] = deepcopy(config['critic_hiddens'])
Tune samples a value for each key independently, so the two can differ. I want these two settings to always be equal.
My second question is: why don't the tasks run simultaneously?
For the second question, it's probably because you don't have enough resources to run more than one trial at a time. Could you send the entire screenshot of the report table? That will show us how many resources you have.
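On the first question, one possible approach (a sketch only, not tested against your setup) is to sample just critic_hiddens and derive actor_hiddens from it. tune.sample_from can read an already-resolved key through spec.config, and PopulationBasedTraining accepts a custom_explore_fn that is called on the config after each built-in perturbation, so the tie can be re-applied there. The helper names, the perturbation_interval value, and the choice to mutate only critic_hiddens are my own assumptions; the Ray imports are kept inside a function so the helper can be tried on a plain dict first.

```python
from copy import deepcopy

# Candidate layer sizes, copied from the question.
HIDDEN_CHOICES = [[64, 64], [64], [32], [32, 32], [16, 16]]


def tie_actor_to_critic(config):
    """Return a copy of config whose actor layers mirror the critic layers."""
    tied = dict(config)
    tied["actor_hiddens"] = deepcopy(tied["critic_hiddens"])
    return tied


def build_search_space_and_scheduler():
    """Hypothetical wiring into Tune; assumes Ray Tune is installed."""
    from ray import tune
    from ray.tune.schedulers import PopulationBasedTraining

    search_space = {
        "critic_hiddens": tune.choice(HIDDEN_CHOICES),
        # Derived from the sampled critic_hiddens, so both keys start out equal.
        "actor_hiddens": tune.sample_from(
            lambda spec: deepcopy(spec.config.critic_hiddens)
        ),
    }
    scheduler = PopulationBasedTraining(
        time_attr="training_iteration",
        perturbation_interval=5,  # illustrative value, not from this thread
        hyperparam_mutations={"critic_hiddens": HIDDEN_CHOICES},
        # Invoked after each built-in perturbation; re-applies the tie.
        custom_explore_fn=tie_actor_to_critic,
    )
    return search_space, scheduler
```

You would merge search_space into your TD3 config and pass the scheduler to tune.run. Calling tie_actor_to_critic on a plain dict is a quick way to check that the two keys end up equal before launching any trials.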
Thank you @amogkam !
Your advice is helpful to me, and sorry for the late reply.
I’ve been working on something else for a while.
I think I have enough resources, but I’ll upload my report table.