Issue with custom LSTMs

@mannyv

I have gym==0.22.0

@Dylan_Miller,

I am not sure what is going on. I updated the Colab to use Ray 1.6 and gym 0.22.0, and it is still running fine.

@mannyv

I got it running; that was my mistake. I must have changed something, although I could not tell you what. It was an error in numpy's random generator, but I do not know how I introduced it. It got resolved when I recopied the code over to my local machine, which I did because we run in a Docker environment and I wanted to test it there.

That said, there seems to be something strange with the example you sent: I do not think it is actually using the custom LSTM. I first tried placing debugger breakpoints in forward_rnn(); when they were never hit I added print statements, which were not hit either. To make sure the Docker container was not the issue, I ran outside of it and saw the same behavior. Finally, as a sanity check, I added a print statement to forward_rnn() in the Colab and changed nothing else, and it still never hit the print statement.
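
A quick sanity check for which model class the trainer actually built would be something like the following (a sketch, assuming the Ray 1.x PPOTrainer API and the same config dict the script passes to tune.run()):

    import ray
    from ray.rllib.agents.ppo import PPOTrainer

    ray.init()
    # Build a trainer directly from the same config passed to tune.run().
    trainer = PPOTrainer(config=config)
    # Prints TorchRNNModel only if the custom model is actually picked up;
    # otherwise it prints RLlib's default torch model class.
    print(type(trainer.get_policy().model))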

What it does do is print the config and some metrics:

Trial PPO_RandomEnv_3bf56_00000 reported episode_reward_max=5.714020108804107,
episode_reward_min=-11.665831744670868,episode_reward_mean=0.08150149951341729,
episode_len_mean=10.702412868632708,
episode_media={},
episodes_this_iter=373,
policy_reward_min={},
policy_reward_max={},
policy_reward_mean={},
custom_metrics={},
sampler_perf={'mean_raw_obs_processing_ms': 0.07456076680575895, 'mean_inference_ms': 0.733874617154554, 'mean_action_processing_ms': 0.06124826483245335, 'mean_env_wait_ms': 0.18261051328964367, 'mean_env_render_ms': 0.0},
off_policy_estimator={},
num_healthy_workers=1,
agent_timesteps_total=12000,
timers={'sample_time_ms': 4245.515, 'sample_throughput': 942.171, 'load_time_ms': 0.034, 'load_throughput': 117872711.944, 'learn_time_ms': 4627.486, 'learn_throughput': 864.4, 'update_time_ms': 1.637},
info={'learner': {'default_policy': {'learner_stats': {'allreduce_latency': 0.0, 'cur_kl_coeff': 0.20000000000000004, 'cur_lr': 5.0000000000000016e-05, 'total_loss': 2.7158842880238767, 'policy_loss': -0.012725519275753409, 'vf_loss': 2.7262376527632437, 'vf_explained_var': array([0.0025146], 
dtype=float32), 'kl': 0.011860704763387402, 'entropy': 2.7788586731879943, 'entropy_coeff': 0.0}, 'model': {}, 'custom_metrics': {}}}, 'num_steps_sampled': 12000, 'num_agent_steps_sampled': 12000, 'num_steps_trained': 12000, 'num_agent_steps_trained': 12000},
perf={'cpu_util_percent': 14.93846153846154, 'ram_util_percent': 50.06923076923078} with parameters={'env': <class 'ray.rllib.examples.env.random_env.RandomEnv'>, 'env_config': {'observation_space': Box(-1.0, 1.0, (15,), float32), 'action_space': Box(0.0, 1.0, (2,), float32)}, 
'model': {'use_lstm': False, 'lstm_cell_size': 512}, 
'num_gpus': 0, 'num_workers': 1, 'framework': 'torch'}

I pulled the example from the RLlib GitHub and replaced your simplified version, and I am seeing the same behavior. That may not be unexpected, though, given that the problem seems to be the config not picking up the custom model.

@Dylan_Miller,

That was my fault; I had the model sub-config in there twice, so the second one was overriding the custom_model entry.

        "model": {
            "custom_model": TorchRNNModel,
        },
...
            "model": {
                "use_lstm": False,
                "lstm_cell_size": 512,
            },

I updated the config to fix that and added a print in forward_rnn.
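
For reference, the deduplicated config ends up looking roughly like this (a sketch reconstructed from the trial parameters printed further down; TorchRNNModel is the custom model class from the script):

    from gym.spaces import Box
    import numpy as np
    from ray.rllib.examples.env.random_env import RandomEnv

    config = {
        "env": RandomEnv,
        "env_config": {
            "observation_space": Box(-1.0, 1.0, (15,), np.float32),
            "action_space": Box(0.0, 1.0, (2,), np.float32),
        },
        # A single "model" entry so nothing overrides custom_model.
        "model": {
            "custom_model": TorchRNNModel,
        },
        "num_gpus": 0,
        "num_workers": 1,
        "framework": "torch",
    }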

@mannyv

Ah, not sure how I missed that.

@mannyv

Fixing the config broke it in other ways.

2022-03-03 09:55:13,647	ERROR actor.py:745 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=55154, ip=172.20.241.6)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 580, in __init__
    self._build_policy_map(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1375, in _build_policy_map
    self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy_map.py", line 136, in create_policy
    self[policy_id] = class_(observation_space, action_space,
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy_template.py", line 242, in __init__
    self.model = ModelCatalog.get_model_v2(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/models/catalog.py", line 537, in get_model_v2
    instance = model_cls(obs_space, action_space, num_outputs,
  File "/home/dymiller/projects/faslane/test_lstm.py", line 28, in __init__
    self.obs_size = get_preprocessor(obs_space)(obs_space).size
NameError: name 'get_preprocessor' is not defined
2022-03-03 09:55:13,649	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-03-03 09:55:13,649	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-03-03 09:55:13,651	ERROR actor.py:745 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=55154, ip=172.20.241.6)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 136, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 592, in __init__
    super().__init__(config, logger_creator)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trainable.py", line 103, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 146, in setup
    super().setup(config)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 739, in setup
    self._init(self.config, self.env_creator)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 170, in _init
    self.workers = self._make_workers(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 821, in _make_workers
    return WorkerSet(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 103, in __init__
    self._local_worker = self._make_worker(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 399, in _make_worker
    worker = cls(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 495, in __init__
    policy_dict = _determine_spaces_for_multi_agent_dict(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1458, in _determine_spaces_for_multi_agent_dict
    raise ValueError(
ValueError: `observation_space` not provided in PolicySpec for default_policy and env does not have an observation space OR no spaces received from other workers' env(s) OR no `observation_space` specified in config!
2022-03-03 09:55:13,652	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
== Status ==
Memory usage on this node: 8.1/15.4 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/12 CPUs, 0/1 GPUs, 0.0/5.06 GiB heap, 0.0/2.53 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/dymiller/ray_results/PPO
Number of trials: 1/1 (1 RUNNING)


2022-03-03 09:55:14,192	ERROR trial_runner.py:773 -- Trial PPO_RandomEnv_ee217_00000: Error processing event.
Traceback (most recent call last):
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trial_runner.py", line 739, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/ray_trial_executor.py", line 746, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 82, in wrapper
    return func(*args, **kwargs)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/worker.py", line 1621, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PPO.train_buffered()::Exiting (pid=55154, ip=172.20.241.6, repr=PPO)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trainable.py", line 178, in train_buffered
    result = self.train()
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 651, in train
    raise e
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 637, in train
    result = Trainable.train(self)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trainable.py", line 237, in train
    result = self.step()
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 193, in step
    res = next(self.train_exec_impl)
AttributeError: 'PPO' object has no attribute 'train_exec_impl'
The trial PPO_RandomEnv_ee217_00000 errored with parameters={'env': <class 'ray.rllib.examples.env.random_env.RandomEnv'>, 'env_config': {'observation_space': Box(-1.0, 1.0, (15,), float32), 'action_space': Box(0.0, 1.0, (2,), float32)}, 'model': {'custom_model': <class '__main__.TorchRNNModel'>}, 'num_gpus': 0, 'num_workers': 1, 'framework': 'torch'}. Error file: /home/dymiller/ray_results/PPO/PPO_RandomEnv_ee217_00000_0_2022-03-03_09-55-13/error.txt
2022-03-03 09:55:14,195	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
== Status ==
Memory usage on this node: 8.1/15.4 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/12 CPUs, 0/1 GPUs, 0.0/5.06 GiB heap, 0.0/2.53 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/dymiller/ray_results/PPO
Number of trials: 1/1 (1 ERROR)
+---------------------------+----------+-------+
| Trial name                | status   | loc   |
|---------------------------+----------+-------|
| PPO_RandomEnv_ee217_00000 | ERROR    |       |
+---------------------------+----------+-------+
Number of errored trials: 1
+---------------------------+--------------+------------------------------------------------------------------------------------------+
| Trial name                |   # failures | error file                                                                               |
|---------------------------+--------------+------------------------------------------------------------------------------------------|
| PPO_RandomEnv_ee217_00000 |            1 | /home/dymiller/ray_results/PPO/PPO_RandomEnv_ee217_00000_0_2022-03-03_09-55-13/error.txt |
+---------------------------+--------------+------------------------------------------------------------------------------------------+

Traceback (most recent call last):
  File "/home/dymiller/projects/project/test_lstm.py", line 96, in <module>
    results = tune.run("PPO", config=config, stop=stop, verbose=2)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/tune.py", line 555, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_RandomEnv_ee217_00000])

I will work on fixing this and get back to you.

@Dylan_Miller,

OK. For the error you listed, you just need to import the preprocessor. I had to do that too in the Colab notebook, so you can find it there if you have not already.
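
For reference, the import in question is the one the torch RNN example model uses, something like:

    from ray.rllib.models.preprocessors import get_preprocessor

    # Used in the model's __init__ to flatten the observation space:
    # self.obs_size = get_preprocessor(obs_space)(obs_space).size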

@mannyv

I did, and now I get other errors.

2022-03-03 10:59:41,623	ERROR actor.py:745 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=97995, ip=172.20.241.6)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 580, in __init__
    self._build_policy_map(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1375, in _build_policy_map
    self.policy_map.create_policy(name, orig_cls, obs_space, act_space,
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy_map.py", line 136, in create_policy
    self[policy_id] = class_(observation_space, action_space,
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy_template.py", line 279, in __init__
    self._initialize_loss_from_dummy_batch(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/policy.py", line 750, in _initialize_loss_from_dummy_batch
    self.compute_actions_from_input_dict(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/torch_policy.py", line 299, in compute_actions_from_input_dict
    return self._compute_action_helper(input_dict, state_batches,
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/utils/threading.py", line 21, in wrapper
    return func(self, *a, **k)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/policy/torch_policy.py", line 363, in _compute_action_helper
    dist_inputs, state_out = self.model(input_dict, state_batches,
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/models/modelv2.py", line 230, in __call__
    res = self.forward(restored, state or [], seq_lens)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/models/torch/recurrent_net.py", line 83, in forward
    output, new_state = self.forward_rnn(inputs, state, seq_lens)
TypeError: cannot unpack non-iterable NoneType object
2022-03-03 10:59:41,624	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-03-03 10:59:41,625	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
2022-03-03 10:59:41,627	ERROR actor.py:745 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=97995, ip=172.20.241.6)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 136, in __init__
    Trainer.__init__(self, config, env, logger_creator)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 592, in __init__
    super().__init__(config, logger_creator)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trainable.py", line 103, in __init__
    self.setup(copy.deepcopy(self.config))
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 146, in setup
    super().setup(config)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 739, in setup
    self._init(self.config, self.env_creator)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 170, in _init
    self.workers = self._make_workers(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 821, in _make_workers
    return WorkerSet(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 103, in __init__
    self._local_worker = self._make_worker(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 399, in _make_worker
    worker = cls(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 495, in __init__
    policy_dict = _determine_spaces_for_multi_agent_dict(
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1458, in _determine_spaces_for_multi_agent_dict
    raise ValueError(
ValueError: `observation_space` not provided in PolicySpec for default_policy and env does not have an observation space OR no spaces received from other workers' env(s) OR no `observation_space` specified in config!
2022-03-03 10:59:41,628	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
== Status ==
Memory usage on this node: 9.0/15.4 GiB
Using FIFO scheduling algorithm.
Resources requested: 2.0/12 CPUs, 0/1 GPUs, 0.0/4.57 GiB heap, 0.0/2.29 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/dymiller/ray_results/PPO
Number of trials: 1/1 (1 RUNNING)


2022-03-03 10:59:42,153	ERROR trial_runner.py:773 -- Trial PPO_RandomEnv_ef988_00000: Error processing event.
Traceback (most recent call last):
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trial_runner.py", line 739, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/ray_trial_executor.py", line 746, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 82, in wrapper
    return func(*args, **kwargs)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/worker.py", line 1621, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PPO.train_buffered()::Exiting (pid=97995, ip=172.20.241.6, repr=PPO)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trainable.py", line 178, in train_buffered
    result = self.train()
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 651, in train
    raise e
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 637, in train
    result = Trainable.train(self)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/trainable.py", line 237, in train
    result = self.step()
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 193, in step
    res = next(self.train_exec_impl)
AttributeError: 'PPO' object has no attribute 'train_exec_impl'
The trial PPO_RandomEnv_ef988_00000 errored with parameters={'env': <class 'ray.rllib.examples.env.random_env.RandomEnv'>, 'env_config': {'observation_space': Box(-1.0, 1.0, (15,), float32), 'action_space': Box(0.0, 1.0, (2,), float32)}, 'model': {'custom_model': <class '__main__.TorchRNNModel'>}, 'num_gpus': 0, 'num_workers': 1, 'framework': 'torch'}. Error file: /home/dymiller/ray_results/PPO/PPO_RandomEnv_ef988_00000_0_2022-03-03_10-59-41/error.txt
2022-03-03 10:59:42,156	WARNING worker.py:498 -- `ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
== Status ==
Memory usage on this node: 9.0/15.4 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/12 CPUs, 0/1 GPUs, 0.0/4.57 GiB heap, 0.0/2.29 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /home/dymiller/ray_results/PPO
Number of trials: 1/1 (1 ERROR)
+---------------------------+----------+-------+
| Trial name                | status   | loc   |
|---------------------------+----------+-------|
| PPO_RandomEnv_ef988_00000 | ERROR    |       |
+---------------------------+----------+-------+
Number of errored trials: 1
+---------------------------+--------------+------------------------------------------------------------------------------------------+
| Trial name                |   # failures | error file                                                                               |
|---------------------------+--------------+------------------------------------------------------------------------------------------|
| PPO_RandomEnv_ef988_00000 |            1 | /home/dymiller/ray_results/PPO/PPO_RandomEnv_ef988_00000_0_2022-03-03_10-59-41/error.txt |
+---------------------------+--------------+------------------------------------------------------------------------------------------+

Traceback (most recent call last):
  File "/home/dymiller/projects/project/test_lstm.py", line 98, in <module>
    results = tune.run("PPO", config=config, stop=stop, verbose=2)
  File "/home/dymiller/miniconda3/envs/test-lstm/lib/python3.9/site-packages/ray/tune/tune.py", line 555, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_RandomEnv_ef988_00000])

I have tried this in Ray 1.6, 1.7, 1.8, and 1.9 and get a similar error in each.

@mannyv

I am still looking through these.

@Dylan_Miller,

When I get that error, it usually means I forgot to include a return statement.
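
For comparison, forward_rnn() in the RLlib example model ends by returning both the action logits and the new state, roughly like this (only the method is shown; fc1, lstm, action_branch, and _features are the layer/attribute names from the example TorchRNNModel):

    def forward_rnn(self, inputs, state, seq_lens):
        # inputs: [B, T, obs_size]; state: [h, c] from get_initial_state().
        x = nn.functional.relu(self.fc1(inputs))
        lstm_out, [h, c] = self.lstm(
            x, [torch.unsqueeze(state[0], 0), torch.unsqueeze(state[1], 0)]
        )
        self._features = lstm_out
        action_out = self.action_branch(self._features)
        # Without this return, RecurrentNetwork.forward() receives None and
        # raises "cannot unpack non-iterable NoneType object".
        return action_out, [torch.squeeze(h, 0), torch.squeeze(c, 0)]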

@mannyv

Ah, I am missing everything today, thanks.

Hi @Dylan_Miller,
Did you solve this problem? I have a similar problem and would be happy to hear what you did to fix it.