Hi,
I’m working on a policy with a fairly large encoder. For PPO I use a shared encoder with separate value-function and policy heads, which is fairly standard as far as I know:
@override(TorchModelV2)
def forward(self, input_dict, state, seq_lens):
    self._features = self._encoder(input_dict["obs"])
    logits = self._policy(self._features)
    return logits, state

@override(TorchModelV2)
def value_function(self):
    assert self._features is not None, "must call forward() first"
    return self._vf(self._features).squeeze(1)
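In case it helps, the constructor is roughly like the sketch below (the real encoder is larger; the layer sizes and the flat-observation assumption are just placeholders):

import numpy as np
import torch.nn as nn
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


class SharedEncoderModel(TorchModelV2, nn.Module):
    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                              model_config, name)
        nn.Module.__init__(self)
        obs_dim = int(np.prod(obs_space.shape))
        # Shared encoder: gradients from both the policy loss and the
        # value loss flow back into these layers.
        self._encoder = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
        )
        # Separate heads on top of the shared features.
        self._policy = nn.Linear(256, num_outputs)
        self._vf = nn.Linear(256, 1)
        self._features = None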
I wanted to try SAC on the same problem, but the default SAC model seems to explicitly advise against sharing parameters between the Q-net and the policy-net (see here):
@override(TorchModelV2)
def forward(
    self,
    input_dict: Dict[str, TensorType],
    state: List[TensorType],
    seq_lens: TensorType,
) -> (TensorType, List[TensorType]):
    """The common (Q-net and policy-net) forward pass.

    NOTE: It is not(!) recommended to override this method as it would
    introduce a shared pre-network, which would be updated by both
    actor- and critic optimizers.
    """
    return input_dict["obs"], state
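To make the question concrete, what I would be tempted to do is roughly the sketch below (assuming I build self._encoder in the model's constructor and size the policy/Q sub-models for the encoded features instead of the raw obs; the import path may differ depending on the Ray version):

from ray.rllib.agents.sac.sac_torch_model import SACTorchModel
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2
from ray.rllib.utils.annotations import override


class SharedEncoderSACModel(SACTorchModel):
    @override(TorchModelV2)
    def forward(self, input_dict, state, seq_lens):
        # Shared pre-network in front of both the policy head and the
        # Q-heads -- exactly what the docstring above warns about, since
        # both the actor and the critic optimizers would then update
        # self._encoder (assumed to be built in __init__).
        shared_features = self._encoder(input_dict["obs"])
        return shared_features, state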
Is SAC not suited to the kind of weight sharing that works for PPO, or should it be just fine?