What is the use for Model Layer "_value_branch" for Gradient Free Optimization (ES, ARS)?

If I initialize from a predefined model, I notice that there is an additional network for the value function by default.

However, ES/ARS never make use of the value function. So is this part of the model even needed, or can it be safely deleted?

If it is not needed, then the original implementation in ray/es.py (ray-project/ray on GitHub, master branch) should not update the parameters of the value function.

Possible script to run:

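# Assumes Ray 1.x, where ESTrainer lives in ray.rllib.agents.es.
from ray import tune
from ray.rllib.agents.es import ESTrainer

tune.run(ESTrainer, config={"env": "CartPole-v0",
                            "framework": "torch",
                            "num_workers": 1,
                            "stepsize": 0.01,
                            "model": {"fcnet_hiddens": []}},
         stop={"training_iteration": 10})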

Hey @Zhao_Pengfei, great question. The answer is that there is no use for that branch. It's simply constructed because our default models all have that branch in them.
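
To make the effect concrete, here is a minimal, library-free sketch of a simplified ES update (antithetic sampling, no fitness shaping). It is not RLlib's actual code. The parameter names `policy.w0` and `policy.w1`, the toy `fitness` function, and the `es_step` helper are all hypothetical; only the `_value_branch` prefix comes from the question. The sketch suggests that when the whole flat parameter vector is perturbed, the unused value-branch weight still drifts, while masking it out by name leaves it untouched.

```python
import random


def es_step(params, fitness, trainable, sigma=0.1, stepsize=0.01, pop=20, seed=0):
    """One simplified ES update over a dict of named scalar parameters.

    Only the names listed in `trainable` are perturbed and updated.
    """
    rng = random.Random(seed)
    names = sorted(trainable)
    grad = {n: 0.0 for n in names}
    for _ in range(pop):
        noise = {n: rng.gauss(0.0, 1.0) for n in names}
        plus, minus = dict(params), dict(params)
        for n in names:
            plus[n] += sigma * noise[n]
            minus[n] -= sigma * noise[n]
        # Antithetic pair: compare the fitness of +noise and -noise.
        diff = fitness(plus) - fitness(minus)
        for n in names:
            grad[n] += diff * noise[n]
    new_params = dict(params)
    for n in names:
        new_params[n] += stepsize * grad[n] / (2 * pop * sigma)
    return new_params


def fitness(p):
    # Hypothetical toy objective: depends on the policy weights only,
    # never on the value branch (as is the case for ES/ARS).
    return -((p["policy.w0"] - 1.0) ** 2) - ((p["policy.w1"] + 2.0) ** 2)


params = {"policy.w0": 0.5, "policy.w1": -0.3, "_value_branch.w": 0.2}

# Naive update: every parameter is perturbed, including the unused branch.
naive = es_step(params, fitness, trainable=list(params))

# Masked update: skip anything whose name starts with "_value_branch".
policy_only = [n for n in params if not n.startswith("_value_branch")]
masked = es_step(params, fitness, trainable=policy_only)

print("naive  value weight:", naive["_value_branch.w"])
print("masked value weight:", masked["_value_branch.w"])
```

In the naive run the value weight moves even though it never influences the fitness, because its noise is still weighted by the fitness differences. The drift does not affect the policy output, so it is harmless, but it adds unused dimensions to the search.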

We are currently working on a new model builder API that would get rid of these unneeded value branches and give the user more control over what RLlib’s default models will look like.