mg64ve
December 29, 2021, 12:59pm
Hi, I have just put the attention net, along with its requirements, into one file:
import numpy as np
import gym
from gym.spaces import Box, Discrete, MultiDiscrete
from typing import Dict, List, Union
from gym.envs.classic_control import CartPoleEnv
import ray
from ray import tune
from ray.tune.registry import register_env
from ray.rllib.models import ModelCatalog
from ray.rllib.utils.test_utils import check_learning_achieved
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2
from ray.rllib.models.torch.misc import SlimFC
from ray.rllib.models.preprocessors import get_preprocessor
from ray.rllib.models.torch.recurrent_net import RecurrentNetwork
from ray.rllib.utils.annotations import override
from ray.rllib.utils.framework import try_import_torch
from ray.rllib.policy.view_requirement import ViewRequirement
from ray.rllib.utils.torch_utils import one_hot as torch_one_hot
from ray.rllib.utils.typing import ModelConfigDict, TensorType
This file has been truncated.
I then replaced the model with an LSTM-only version, without the initial FC layers.
It is in this file:
import numpy as np
import gym
from gym.spaces import Box, Discrete, MultiDiscrete
from typing import Dict, List, Union
from gym.envs.classic_control import CartPoleEnv
import torch
from torch import nn
from torch.autograd import Variable
import ray
from ray import tune
from ray.tune.registry import register_env
from ray.rllib.models import ModelCatalog
from ray.rllib.utils.test_utils import check_learning_achieved
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2
from ray.rllib.models.torch.misc import SlimFC
from ray.rllib.models.preprocessors import get_preprocessor
from ray.rllib.models.torch.recurrent_net import RecurrentNetwork
from ray.rllib.utils.annotations import override
This file has been truncated.
The problem I am facing is with the get_initial_state function.
I have tried many options and none of them seems to work:
def get_initial_state(self):
    # h = [
    #     torch.zeros(self.lstm_size),
    #     torch.zeros(self.lstm_size)
    # ]
    # h = [
    #     torch.zeros(1, self.lstm_size),
    #     torch.zeros(1, self.lstm_size)
    # ]
    h = self.lstm.weight_hh_l0.data.fill_(0)
    return h
Can you please give me a hint on this?
In pure PyTorch one of these methods should work, but in Ray it does not.
I am getting the following error:
(PPOTrainer pid=133035) RuntimeError: Expected hidden[0] size (1, 32, 16), got [1, 4, 16]
Hey @mg64ve, thanks for the question! Some problems I see in your implementation of get_initial_state:
The return value should always be a list of state tensors, so in your case, a list with one single item, which is the h-state-tensor (you are returning h directly, without the list).
You seem to return a state tensor that has the same shape as the weight matrix, but I think you should return a state tensor that has the same shape as the bias vector.
Also, state tensors in your returned list should all be non-batched, but I think you are doing this correctly here.
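The points above can be sketched in plain PyTorch. This is only an illustration under stated assumptions, not the code from the truncated files: TinyLSTMModel, obs_size and forward_rnn's exact signature here are hypothetical stand-ins, and the sketch assumes the underlying layer is a torch.nn.LSTM. Since nn.LSTM carries both a hidden state and a cell state, the list in this sketch has two entries (h and c), each non-batched with shape [lstm_size]. The batching step at the bottom only emulates a caller that stacks one initial state per sequence.

```python
import torch
from torch import nn


class TinyLSTMModel(nn.Module):
    """Hypothetical stand-in for the custom model; not an RLlib class."""

    def __init__(self, obs_size: int = 4, lstm_size: int = 16):
        super().__init__()
        self.lstm_size = lstm_size
        self.lstm = nn.LSTM(obs_size, lstm_size, batch_first=True)

    def get_initial_state(self):
        # Return a list of non-batched tensors, one per state component.
        # nn.LSTM tracks two components (h and c), hence two entries.
        return [torch.zeros(self.lstm_size), torch.zeros(self.lstm_size)]

    def forward_rnn(self, inputs, state):
        # inputs: [B, T, obs_size]; state: list of [B, lstm_size] tensors.
        # nn.LSTM expects states shaped [num_layers, B, lstm_size].
        h, c = (s.unsqueeze(0) for s in state)
        out, (h, c) = self.lstm(inputs, (h, c))
        return out, [h.squeeze(0), c.squeeze(0)]


model = TinyLSTMModel()
B, T = 4, 8  # illustrative number of sequences and sequence length

# Emulate the caller: stack one copy of each initial state per sequence.
state = [s.unsqueeze(0).repeat(B, 1) for s in model.get_initial_state()]
out, new_state = model.forward_rnn(torch.zeros(B, T, 4), state)
print(out.shape, [s.shape for s in new_state])
```

In this sketch the batch dimension of the states (B) matches the batch dimension of the inputs, which is the condition nn.LSTM checks when it raises an "Expected hidden[0] size ..." error.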