Issue with custom LSTMs

    if state_outs:
        B = 4  # For RNNs, have B=4, T=[depends on sample_batch_size]
        i = 0
        while "state_in_{}".format(i) in postprocessed_batch:
            postprocessed_batch["state_in_{}".format(i)] = \
                postprocessed_batch["state_in_{}".format(i)][:B]
            if "state_out_{}".format(i) in postprocessed_batch:
                postprocessed_batch["state_out_{}".format(i)] = \
                    postprocessed_batch["state_out_{}".format(i)][:B]
            i += 1
        seq_len = sample_batch_size // B
        seq_lens = np.array([seq_len for _ in range(B)], dtype=np.int32)
        postprocessed_batch["seq_lens"] = seq_lens
    # Switch on lazy to-tensor conversion on `postprocessed_batch`.
    train_batch = self._lazy_tensor_dict(postprocessed_batch)

In the above code, in policy.py in the function _initialize_loss_from_dummy_batch(), why is the state size restricted to postprocessed_batch["state_in_{}".format(i)][:B]? This seems to break my custom LSTM, because on the first batch the state size was 4x this size.

@Dylan_Miller,

Welcome to the forum. Do you have a copy of the error stack trace you can share, and a reproduction script?

Your model should be agnostic to the size of the first dimension (the batch size), as this will differ between the sampling and training phases.

In forward the input shape will be [B*T, F].
In forward_rnn it will be [B, T, F].
(B = batch, T = time, F = feature sizes.)
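
Here is a minimal, hedged sketch (placeholder sizes, not your model) of how forward_rnn typically handles those shapes with a single-layer nn.LSTM, so that it works for whatever batch size it is given:

    import torch
    import torch.nn as nn

    # Minimal sketch with placeholder sizes (15 features, 512 hidden units).
    # nn.LSTM with batch_first=True takes inputs of shape [B, T, F] but still
    # expects the hidden/cell states as [num_layers, B, H], so the [B, H]
    # states passed into forward_rnn() are unsqueezed along dim 0, not dim 1.
    class TinyRNN(nn.Module):
        def __init__(self, feat_size=15, hidden_size=512):
            super().__init__()
            self.lstm = nn.LSTM(feat_size, hidden_size, batch_first=True)

        def forward_rnn(self, inputs, state, seq_lens):
            h_in = torch.unsqueeze(state[0], 0)  # [B, H] -> [1, B, H]
            c_in = torch.unsqueeze(state[1], 0)  # [B, H] -> [1, B, H]
            out, (h, c) = self.lstm(inputs, (h_in, c_in))
            return out, [torch.squeeze(h, 0), torch.squeeze(c, 0)]

    # Works for any B, because B is read off the incoming tensors rather than
    # hard-coded, e.g. B=32, T=1 (sampling-style) and B=4, T=8 (train-style).
    model = TinyRNN()
    for B, T in [(32, 1), (4, 8)]:
        out, new_state = model.forward_rnn(
            torch.zeros(B, T, 15),
            [torch.zeros(B, 512), torch.zeros(B, 512)],
            None)
        print(tuple(out.shape), [tuple(s.shape) for s in new_state])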

RuntimeError: Expected hidden[0] size (1, 32, 512), got [1, 4, 512]

That is the error I get in the forward pass of the lstm in forward_rnn().

I will work on getting a code sample to reproduce this, but I unfortunately cannot provide the code I am using.

Hi @Dylan_Miller,

and welcome to the discussion board. Do you also override the get_initial_state() method of the Policy class? If so, how do you set the initial state?

I initialize with

@override(ModelV2)
def get_initial_state(self):
    # make hidden states on same device as model
    initial_state = [
        torch.zeros(self._lstm_nb_hidden),
        torch.zeros(self._lstm_nb_hidden),
    ]
    return initial_state

although I have also tried with

    @override(ModelV2)
    def get_initial_state(self):
        # TODO: (sven): Get rid of `get_initial_state` once Trajectory
        #  View API is supported across all of RLlib.
        # Place hidden states on same device as model.
        h = [
            self.fc1.weight.new(1, self.lstm_state_size).zero_().squeeze(0),
            self.fc1.weight.new(1, self.lstm_state_size).zero_().squeeze(0)
        ]
        return h

as is shown in https://github.com/ray-project/ray/blob/ray-1.6.0/rllib/examples/models/rnn_model.py
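
Either way, the model returns two per-sample tensors of shape [lstm_state_size]. As a rough sketch of what happens next (sizes assumed), RLlib stacks one copy per batch row to build the [B, H] state batches that eventually reach forward_rnn():

    import torch

    # Rough sketch with assumed sizes: the per-sample [H] states returned by
    # get_initial_state() get stacked into [B, H] batches, one row per
    # sampled sequence, before being handed to the recurrent forward pass.
    H = 512
    initial_state = [torch.zeros(H), torch.zeros(H)]
    B = 4
    state_batch = [torch.stack([s] * B) for s in initial_state]
    print([tuple(s.shape) for s in state_batch])  # [(4, 512), (4, 512)]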

@Dylan_Miller,

Do you know where the 4 is coming from? Is that a value in your model?

The stack trace showing which functions are throwing this error would be really helpful.

My guess would be that it comes from the default rollout_fragment_length used by the agent, or from the B=4 in the custom code above.

Correct, it comes from the B=4, but this is not custom code; it is in RLlib's policy.py. That is why my initial question was about the purpose of that code in RLlib.

Sorry, I forgot the stack trace.

2022-02-25 20:17:26,644 ERROR actor.py:745 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=353970, ip=172.17.0.2)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 580, in __init__
    self._build_policy_map(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1375, in _build_policy_map
    self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/policy_map.py", line 136, in create_policy
    self[policy_id] = class_(observation_space, action_space,
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/policy_template.py", line 279, in __init__
    self._initialize_loss_from_dummy_batch(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/policy.py", line 750, in _initialize_loss_from_dummy_batch
    self.compute_actions_from_input_dict(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/torch_policy.py", line 299, in compute_actions_from_input_dict
    return self._compute_action_helper(input_dict, state_batches,
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/utils/threading.py", line 21, in wrapper
    return func(self, *a, **k)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/torch_policy.py", line 363, in _compute_action_helper
    dist_inputs, state_out = self.model(input_dict, state_batches,
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/models/modelv2.py", line 230, in __call__
    res = self.forward(restored, state or [], seq_lens)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/models/torch/recurrent_net.py", line 83, in forward
    output, new_state = self.forward_rnn(inputs, state, seq_lens)
  File "/home/docker/project_models/project_models/ray_alphastar/alphastar.py", line 217, in forward_rnn
    out, (next_h, next_c) = self.lstm(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/torch/nn/modules/rnn.py", line 677, in forward
    self.check_forward_args(input, hx, batch_sizes)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/torch/nn/modules/rnn.py", line 621, in check_forward_args
    self.check_hidden_size(hidden[0], self.get_expected_hidden_size(input, batch_sizes),
  File "/home/docker/miniconda3/lib/python3.9/site-packages/torch/nn/modules/rnn.py", line 226, in check_hidden_size
    raise RuntimeError(msg.format(expected_hidden_size, list(hx.size())))
RuntimeError: Expected hidden[0] size (1, 1, 512), got [1, 32, 512]
2022-02-25 20:17:26,648 WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-02-25 20:17:26,650 WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-02-25 20:17:26,652 WARNING rollout_worker.py:574 -- You are running ray with `local_mode=True`, but have configured 1 GPUs to be used! In local mode, Policies are placed on the CPU and the `num_gpus` setting is ignored.
2022-02-25 20:17:41,042 ERROR actor.py:745 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::ConditionalPPOTorchTrainer.__init__() (pid=353970, ip=172.17.0.2)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 136, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 592, in __init__
    super().__init__(config, logger_creator)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/tune/trainable.py", line 103, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 146, in setup
    super().setup(config)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 739, in setup
    self._init(self.config, self.env_creator)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 170, in _init
    self.workers = self._make_workers(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 821, in _make_workers
    return WorkerSet(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 103, in __init__
    self._local_worker = self._make_worker(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 399, in _make_worker
    worker = cls(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 580, in __init__
    self._build_policy_map(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1375, in _build_policy_map
    self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/policy_map.py", line 136, in create_policy
    self[policy_id] = class_(observation_space, action_space,
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/policy_template.py", line 279, in __init__
    self._initialize_loss_from_dummy_batch(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/policy.py", line 750, in _initialize_loss_from_dummy_batch
    self.compute_actions_from_input_dict(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/torch_policy.py", line 299, in compute_actions_from_input_dict
    return self._compute_action_helper(input_dict, state_batches,
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/utils/threading.py", line 21, in wrapper
    return func(self, *a, **k)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/policy/torch_policy.py", line 363, in _compute_action_helper
    dist_inputs, state_out = self.model(input_dict, state_batches,
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/models/modelv2.py", line 230, in __call__
    res = self.forward(restored, state or [], seq_lens)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/models/torch/recurrent_net.py", line 83, in forward
    output, new_state = self.forward_rnn(inputs, state, seq_lens)
  File "/home/docker/project_models/project_models/ray_alphastar/alphastar.py", line 217, in forward_rnn
    out, (next_h, next_c) = self.lstm(
  File "/home/docker/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/torch/nn/modules/rnn.py", line 677, in forward
    self.check_forward_args(input, hx, batch_sizes)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/torch/nn/modules/rnn.py", line 621, in check_forward_args
    self.check_hidden_size(hidden[0], self.get_expected_hidden_size(input, batch_sizes),
  File "/home/docker/miniconda3/lib/python3.9/site-packages/torch/nn/modules/rnn.py", line 226, in check_hidden_size
    raise RuntimeError(msg.format(expected_hidden_size, list(hx.size())))
RuntimeError: Expected hidden[0] size (1, 1, 512), got [1, 32, 512]
2022-02-25 20:17:41,045 WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-02-25 20:17:41,587 ERROR syncer.py:72 -- Log sync requires rsync to be installed.
== Status ==
Memory usage on this node: 11.5/15.4 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/2 CPUs, 1.0/1 GPUs, 0.0/7.63 GiB heap, 0.0/3.81 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspaces/projects/project/ray_results/QPR2/experiments
Number of trials: 1/1 (1 RUNNING)
+-----------------------------------------------------------+----------+-------+--------------------+--------------------+
| Trial name                                                | status   | loc   | Custom Metrics 1   | Custom Metrics 2   |
|-----------------------------------------------------------+----------+-------+--------------------+--------------------|
| ConditionalPPOTorchTrainer_HNS_RLLib_Agent_v1_eafce_00000 | RUNNING  |       |                    |                    |
+-----------------------------------------------------------+----------+-------+--------------------+--------------------+


2022-02-25 20:17:41,616 ERROR trial_runner.py:773 -- Trial ConditionalPPOTorchTrainer_HNS_RLLib_Agent_v1_eafce_00000: Error processing event.
Traceback (most recent call last):
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/tune/trial_runner.py", line 739, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/tune/ray_trial_executor.py", line 746, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 82, in wrapper
    return func(*args, **kwargs)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/worker.py", line 1621, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::ConditionalPPOTorchTrainer.train()::Exiting (pid=353970, ip=172.17.0.2, repr=ConditionalPPOTorchTrainer)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 651, in train
    raise e
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 637, in train
    result = Trainable.train(self)
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/tune/trainable.py", line 237, in train
    result = self.step()
  File "/home/docker/miniconda3/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 193, in step
    res = next(self.train_exec_impl)
AttributeError: 'ConditionalPPOTorchTrainer' object has no attribute 'train_exec_impl'
2022-02-25 20:17:41,627 WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
== Status ==
Memory usage on this node: 11.5/15.4 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/2 CPUs, 0/1 GPUs, 0.0/7.63 GiB heap, 0.0/3.81 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspaces/projects/project/ray_results/QPR2/experiments
Number of trials: 1/1 (1 ERROR)
+-----------------------------------------------------------+----------+-------+--------------------+--------------------+
| Trial name                                                | status   | loc   | Custom Metrics 1   | Custom Metrics 2   |
|-----------------------------------------------------------+----------+-------+--------------------+--------------------|
| ConditionalPPOTorchTrainer_HNS_RLLib_Agent_v1_eafce_00000 | ERROR    |       |                    |                    |
+-----------------------------------------------------------+----------+-------+--------------------+--------------------+
Number of errored trials: 1
+-----------------------------------------------------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Trial name                                                |   # failures | error file                                                                                                                                          |
|-----------------------------------------------------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------|
| ConditionalPPOTorchTrainer_HNS_RLLib_Agent_v1_eafce_00000 |            1 | /workspaces/projects/project/ray_results/QPR2/experiments/ConditionalPPOTorchTrainer_HNS_RLLib_Agent_v1_eafce_00000_0_2022-02-25_20-17-13/error.txt |
+-----------------------------------------------------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+

Hi @Dylan_Miller,

The purpose of that code is to determine how to set up the ViewRequirements. RLlib makes three calls with dummy data; these check which values from the input dictionary are used during the sampling phase and the training phase.

To help further I would need to know:

In forward:
The shape of obs,
The length of seq_lens, and the length of each seq in the list.

In forward_rnn:
The shape of the input observation,
The seq_lens info again (should be the same as in forward),
The shape of each state in the states list (see the small sketch below for one way to dump these).

Feel free to DM me if you want.
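
A hedged sketch of how those shapes could be dumped (assumed sizes; in a real model the print would sit at the top of forward() and forward_rnn() instead of being fed dummy tensors):

    import torch

    # Hypothetical shape dump for the items listed above, fed with dummy
    # tensors of assumed sizes (15 obs features, 512 LSTM units, B=4, T=8).
    def report(where, obs, states, seq_lens):
        print(where, "obs", tuple(obs.shape),
              "states", [tuple(s.shape) for s in states],
              "seq_lens", seq_lens.tolist())

    seq_lens = torch.full((4,), 8, dtype=torch.int32)
    report("forward:    ", torch.zeros(32, 15), [torch.zeros(4, 512)] * 2, seq_lens)    # obs is [B*T, F]
    report("forward_rnn:", torch.zeros(4, 8, 15), [torch.zeros(4, 512)] * 2, seq_lens)  # obs is [B, T, F]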

The shape of the obs is 4x15. The length of seq_lens is initially 32; at this point it is all ones, since it is [1]*batch_size. When the 4 is introduced it gets reduced to [8]*B, so length 4 and all 8s.

The shape of the state is initially 32x512 (two of these in the list, obviously), and this gets unsqueezed in forward_rnn to 1x32x512. This then becomes 4x512, unsqueezed to 1x4x512, which is where the issue occurs.
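
Those numbers line up with the policy.py snippet quoted at the top of the thread: the 32 dummy rows get truncated to their first B=4 state rows and are treated as 4 sequences of length 8. A quick check of that arithmetic (state size 512 assumed):

    import numpy as np

    # Quick check of the quoted policy.py logic with a dummy batch of 32 rows,
    # the hard-coded B=4, and an assumed state size of 512.
    sample_batch_size = 32
    B = 4
    state_in_0 = np.zeros((sample_batch_size, 512), dtype=np.float32)

    state_in_0 = state_in_0[:B]                        # (32, 512) -> (4, 512)
    seq_len = sample_batch_size // B                   # 32 // 4 = 8
    seq_lens = np.array([seq_len for _ in range(B)], dtype=np.int32)
    print(state_in_0.shape, seq_lens)                  # (4, 512) [8 8 8 8]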

You have both forward and forward_rnn in the most recent reply; I should only have forward_rnn, correct? There is still the forward inside RLlib, but I only need to override forward_rnn?

I don’t know if it helps, but if I unsqueeze to 32x1x512 I get the error RuntimeError: Expected hidden[0] size (1, 1, 512), got [32, 1, 512] on the very first pass through.

@Dylan_Miller,

Let me keep thinking about the error. You only need to override forward if you want to apply custom processing to the observation before the LSTM, or action masking to the outputs.
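
As a tiny illustration of the action-masking case (made-up shapes, not your model): adding log(mask) to the logits pushes invalid actions to effectively -inf before the action distribution is built.

    import torch

    # Made-up shapes: 4 batch rows, 6 actions; a mask value of 1 marks a
    # valid action. clamp() keeps the -inf from log(0) inside float32 range.
    logits = torch.randn(4, 6)
    mask = torch.tensor([[1., 1., 0., 1., 0., 1.]]).repeat(4, 1)
    masked_logits = logits + torch.clamp(torch.log(mask), min=-3.4e38)
    print(masked_logits)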

@mannyv Great, thank you

@Dylan_Miller,

Maybe we can try this. Here is a Google Colab I put together with a custom LSTM that is working.

I think it has the same observation and LSTM cell size as you shared. Maybe it will help you fix yours, or we can break this one in the way that yours is breaking.

@mannyv ,

This is the example at ray/rnn_model.py at ray-1.6.0 · ray-project/ray · GitHub, correct?

I have been trying to use this as a guideline, but I will play with it here and see if it helps. Thanks.

@Dylan_Miller,

I took it from master, but it should be very similar if not the same. I simplified it a lot to focus on the relevant parts.

2022-03-02 10:51:48,735	WARNING deprecation.py:38 -- DeprecationWarning: `ray.rllib.examples.env.multi_agent.make_multiagent` has been deprecated. Use `ray.rllib.env.multi_agent_env.make_multi_agent` instead. This will raise an error in the future!
/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/_private/services.py:238: UserWarning: Not all Ray Dashboard dependencies were found. To use the dashboard please install Ray using `pip install ray[default]`. To disable this message, set RAY_DISABLE_IMPORT_WARNING env var to '1'.
  warnings.warn(warning_message)
2022-03-02 10:51:49,050	WARNING services.py:1749 -- WARNING: The object store is using /home/dymiller/tmp instead of /dev/shm because /dev/shm has only 1772220416 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=2.29gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
2022-03-02 10:51:50,122	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-03-02 10:51:50,164	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-03-02 10:51:50,172	INFO ppo.py:158 -- In multi-agent mode, policies will be optimized sequentially by the multi-GPU optimizer. Consider setting simple_optimizer=True if this doesn't work for you.
2022-03-02 10:51:50,172	INFO trainer.py:726 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
2022-03-02 10:51:50,196	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-03-02 10:51:50,283	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-03-02 10:51:50,285	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-03-02 10:51:50,296	ERROR actor.py:745 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=315255, ip=192.168.1.157)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 136, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 592, in __init__
    super().__init__(config, logger_creator)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trainable.py", line 103, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 146, in setup
    super().setup(config)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 739, in setup
    self._init(self.config, self.env_creator)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 170, in _init
    self.workers = self._make_workers(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 821, in _make_workers
    return WorkerSet(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 103, in __init__
    self._local_worker = self._make_worker(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 399, in _make_worker
    worker = cls(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 580, in __init__
    self._build_policy_map(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1375, in _build_policy_map
    self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy_map.py", line 136, in create_policy
    self[policy_id] = class_(observation_space, action_space,
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy_template.py", line 279, in __init__
    self._initialize_loss_from_dummy_batch(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy.py", line 746, in _initialize_loss_from_dummy_batch
    self._dummy_batch = self._get_dummy_batch_from_view_requirements(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy.py", line 875, in _get_dummy_batch_from_view_requirements
    ret[view_col] = np.zeros_like([
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy.py", line 876, in <listcomp>
    view_req.space.sample() for _ in range(batch_size)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/gym/spaces/discrete.py", line 28, in sample
    return self.start + self.np_random.randint(self.n)
AttributeError: 'numpy.random._generator.Generator' object has no attribute 'randint'
2022-03-02 10:51:50,297	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
== Status ==
Memory usage on this node: 9.7/15.4 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/12 CPUs, 0/1 GPUs, 0.0/4.16 GiB heap, 0.0/2.08 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/dymiller/ray_results/PPO
Number of trials: 1/1 (1 RUNNING)


2022-03-02 10:51:50,815	ERROR trial_runner.py:773 -- Trial PPO_RandomEnv_ac394_00000: Error processing event.
Traceback (most recent call last):
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trial_runner.py", line 739, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/ray_trial_executor.py", line 746, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 82, in wrapper
    return func(*args, **kwargs)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/worker.py", line 1621, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PPO.train_buffered()::Exiting (pid=315255, ip=192.168.1.157, repr=PPO)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trainable.py", line 178, in train_buffered
    result = self.train()
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 651, in train
    raise e
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 637, in train
    result = Trainable.train(self)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trainable.py", line 237, in train
    result = self.step()
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 193, in step
    res = next(self.train_exec_impl)
AttributeError: 'PPO' object has no attribute 'train_exec_impl'
The trial PPO_RandomEnv_ac394_00000 errored with parameters={'env': <class 'ray.rllib.examples.env.random_env.RandomEnv'>, 'model': {'use_lstm': False, 'lstm_cell_size': 512}, 'num_gpus': 0, 'num_workers': 1, 'framework': 'torch'}. Error file: /home/dymiller/ray_results/PPO/PPO_RandomEnv_ac394_00000_0_2022-03-02_10-51-50/error.txt
2022-03-02 10:51:50,822	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
== Status ==
Memory usage on this node: 9.7/15.4 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/12 CPUs, 0/1 GPUs, 0.0/4.16 GiB heap, 0.0/2.08 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/dymiller/ray_results/PPO
Number of trials: 1/1 (1 ERROR)
+---------------------------+----------+-------+
| Trial name                | status   | loc   |
|---------------------------+----------+-------|
| PPO_RandomEnv_ac394_00000 | ERROR    |       |
+---------------------------+----------+-------+
Number of errored trials: 1
+---------------------------+--------------+------------------------------------------------------------------------------------------+
| Trial name                |   # failures | error file                                                                               |
|---------------------------+--------------+------------------------------------------------------------------------------------------|
| PPO_RandomEnv_ac394_00000 |            1 | /home/dymiller/ray_results/PPO/PPO_RandomEnv_ac394_00000_0_2022-03-02_10-51-50/error.txt |
+---------------------------+--------------+------------------------------------------------------------------------------------------+

Traceback (most recent call last):
  File "/home/dymiller/projects/project/test_lstm.py", line 103, in <module>
    results = tune.run("PPO", config=config, stop=stop, verbose=2)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/tune.py", line 555, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_RandomEnv_ac394_00000])

@mannyv

I am getting the above error when I try to run this. I am using ray 1.6 (which I am currently restricted to). Could this have anything to do with it?

@Dylan_Miller,

I updated the colab to use ray 1.6. It runs fine. What version of gym do you have? The colab has gym==0.17.3.