Question on code: _wrapped_forward

Hello,

can anyone explain to me the purpose of the following two lines:

    # Push obs through "unwrapped" net's `forward()` first.
    wrapped_out, _ = self._wrapped_forward(input_dict, [], None)

in the following file:

They are in the LSTMWrapper class, in the forward() method.
Thanks.

Hi @mg64ve,

When you set config["model"]["use_lstm"] = True, RLlib will automatically create an LSTM for you. That is the code you are looking at.

It works by first creating a base model containing the layers that come before the LSTM (fully connected or conv layers). It then creates a second model that holds the LSTM and the output layers (logits and value).

The line you asked about is where the inputs from the environment are passed through the base model to produce the embedding that is then fed into the LSTM.
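
If it helps, here is a rough toy illustration in plain PyTorch (not the actual RLlib source; the class and layer names are made up) of that two-stage structure. The self.base(obs_seq) call below plays the role of the self._wrapped_forward(...) line you asked about:

    import torch
    import torch.nn as nn

    class BaseNet(nn.Module):
        """Stands in for the FC/conv layers RLlib builds first."""
        def __init__(self, obs_dim, embed_dim):
            super().__init__()
            self.fc = nn.Sequential(nn.Linear(obs_dim, embed_dim), nn.Tanh())

        def forward(self, obs):
            return self.fc(obs)

    class LSTMHead(nn.Module):
        """Stands in for the auto-added LSTM wrapper with logits/value heads."""
        def __init__(self, base, embed_dim, cell_size, num_outputs):
            super().__init__()
            self.base = base
            self.lstm = nn.LSTM(embed_dim, cell_size, batch_first=True)
            self.logits = nn.Linear(cell_size, num_outputs)
            self.value = nn.Linear(cell_size, 1)

        def forward(self, obs_seq, state):
            # Push obs through the base net first (the line in question),
            # then feed the resulting embedding into the LSTM.
            embed = self.base(obs_seq)             # [B, T, embed_dim]
            out, state = self.lstm(embed, state)   # [B, T, cell_size]
            return self.logits(out), self.value(out), state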

OK, thank you very much @mannyv for your explanation.
I would like to implement a custom RNN model with L2 weight regularization.
From this comment:

# === Built-in options ===
# FullyConnectedNetwork (tf and torch): rllib.models.tf|torch.fcnet.py
# These are used if no custom model is specified and the input space is 1D.
# Number of hidden layers to be used.
"fcnet_hiddens": [256, 256],
# Activation function descriptor.
# Supported values are: "tanh", "relu", "swish" (or "silu"),
# "linear" (or None).
"fcnet_activation": "tanh",

I believe that for custom models there is no FC input layer, right?
Now, if I implement my own custom model, I believe it would not have the previous-action/reward stacking mechanism, right? Any good hints on how to implement it?
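
To make the L2 part of my question concrete, here is a minimal sketch of the kind of custom torch RNN model I have in mind, with the penalty added through custom_loss(). The model name "l2_rnn", the cell size of 64 and the L2 coefficient are placeholders I made up, and the sketch does not include any previous-action/reward stacking yet:

    import numpy as np
    import torch
    import torch.nn as nn
    from ray.rllib.models import ModelCatalog
    from ray.rllib.models.torch.recurrent_net import RecurrentNetwork as TorchRNN

    class L2RegRNN(TorchRNN, nn.Module):
        """Custom LSTM model that adds an L2 weight penalty to the loss."""

        CELL_SIZE = 64      # placeholder value
        L2_COEFF = 1e-4     # placeholder value

        def __init__(self, obs_space, action_space, num_outputs, model_config, name):
            nn.Module.__init__(self)
            super().__init__(obs_space, action_space, num_outputs, model_config, name)
            obs_dim = int(np.prod(obs_space.shape))
            self.embed = nn.Linear(obs_dim, self.CELL_SIZE)
            self.lstm = nn.LSTM(self.CELL_SIZE, self.CELL_SIZE, batch_first=True)
            self.logits_branch = nn.Linear(self.CELL_SIZE, num_outputs)
            self.value_branch = nn.Linear(self.CELL_SIZE, 1)
            self._features = None

        def get_initial_state(self):
            # Initial hidden and cell states, one [CELL_SIZE] tensor each.
            return [self.embed.weight.new_zeros(self.CELL_SIZE),
                    self.embed.weight.new_zeros(self.CELL_SIZE)]

        def forward_rnn(self, inputs, state, seq_lens):
            # inputs: [B, T, obs_dim]; state: [h, c], each [B, CELL_SIZE].
            x = torch.relu(self.embed(inputs))
            h, c = state[0].unsqueeze(0), state[1].unsqueeze(0)
            self._features, (h, c) = self.lstm(x, (h, c))
            return self.logits_branch(self._features), [h.squeeze(0), c.squeeze(0)]

        def value_function(self):
            return torch.reshape(self.value_branch(self._features), [-1])

        def custom_loss(self, policy_loss, loss_inputs):
            # L2 penalty over all trainable parameters, added to each loss term.
            l2 = sum(torch.sum(p ** 2) for p in self.parameters() if p.requires_grad)
            if isinstance(policy_loss, (list, tuple)):
                return [loss + self.L2_COEFF * l2 for loss in policy_loss]
            return policy_loss + self.L2_COEFF * l2

    ModelCatalog.register_custom_model("l2_rnn", L2RegRNN)

If I have the mechanics right, it would then be selected through the model config, e.g.:

    config["model"] = {
        "custom_model": "l2_rnn",
        "max_seq_len": 20,
    }

Is that the right direction, and where would the previous-action/reward stacking fit into such a model?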