I’ve noticed that several of the RLlib algorithms have a top-level config parameter named vf_share_layers. However, vf_share_layers is also present in the nested model config. Is this a bug? Is this intentional? Which one should I use?
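To make it concrete, here is a minimal sketch of the two places I mean (a plain Python dict; the values are only illustrative, not the actual defaults):

config = {
    # 1) top-level key, present in several algorithms' default configs:
    "vf_share_layers": False,
    "model": {
        # 2) nested key, present in the common model config:
        "vf_share_layers": True,
    },
}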
I agree, this is very confusing! Thanks for raising awareness of this.
There is a function in ppo_tf_policy.py that overrides the model config value with the top-level one, so the answer is: always use the top-level config key vf_share_layers to set this, if it is available:
def setup_config(...):
...
# Auto set the model option for VF layer sharing.
config["model"]["vf_share_layers"] = config["vf_share_layers"]
I’ll fix this inconsistency and soft-deprecate the top-level config key. That way, if users correctly set vf_share_layers in the model config, they won’t get a bad surprise (currently it is silently overwritten by the Trainer’s top-level value).
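To illustrate that bad surprise, here is a minimal, self-contained sketch of the dict logic above (not actual RLlib code, just the effect of the override):

config = {
    "vf_share_layers": False,            # top-level default, left untouched by the user
    "model": {"vf_share_layers": True},  # what the user actually wants
}
# This is effectively what setup_config() does today:
config["model"]["vf_share_layers"] = config["vf_share_layers"]
print(config["model"]["vf_share_layers"])  # -> False; the user's True was silently dropped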
So the workaround for now is: use the top-level vf_share_layers key only for PPO, MAML, and MB-MPO; for any other algorithm, set it in the model config instead.
For PPO, MAML, and MB-MPO you must use the top-level key, because the value in the model config will always be silently overwritten.
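As a sketch of that workaround (assuming only the key names discussed above; defaults may differ between RLlib versions):

# PPO, MAML, MB-MPO: set the top-level key; the model-level key would be overwritten.
ppo_config = {
    "vf_share_layers": True,
}

# Any other algorithm: set it in the model config instead.
other_config = {
    "model": {
        "vf_share_layers": True,
    },
}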