Specifically, I get this wacky error message when I try to run it:
(pid=10029) 2022-01-27 11:53:06,531 INFO rollout_worker.py:1387 -- Built preprocessor map: {'default_policy': <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x7f2b8e465280>}
(pid=10029) 2022-01-27 11:53:06,531 INFO rollout_worker.py:614 -- Built filter map: {'default_policy': <ray.rllib.utils.filter.NoFilter object at 0x7f2b8e449910>}
(pid=10029) 2022-01-27 11:53:06,536 ERROR worker.py:428 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::PPO.__init__() (pid=10029, ip=172.17.0.6)
== Status ==
Memory usage on this node: 38.7/251.8 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/48 CPUs, 0/1 GPUs, 0.0/165.93 GiB heap, 0.0/75.1 GiB objects
Result logdir: /home/user/aurora-rl/src/outputs/2022-01-27/11-52-33/results/PPO
Number of trials: 1/1 (1 ERROR)
+----------------------+----------+-------+
| Trial name | status | loc |
|----------------------+----------+-------|
| PPO_None_a8cae_00000 | ERROR | |
+----------------------+----------+-------+
Number of errored trials: 1
+----------------------+--------------+-----------------------------------------------------------------------------------------------------------------------+
| Trial name | # failures | error file |
|----------------------+--------------+-----------------------------------------------------------------------------------------------------------------------|
| PPO_None_a8cae_00000 | 1 | /home/user/aurora-rl/src/outputs/2022-01-27/11-52-33/results/PPO/PPO_None_a8cae_00000_0_2022-01-27_11-52-54/error.txt |
+----------------------+--------------+-----------------------------------------------------------------------------------------------------------------------+
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 137, in __init__
(pid=10029) Trainer.__init__(self, config, env, logger_creator)
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 611, in __init__
(pid=10029) super().__init__(config, logger_creator)
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/tune/trainable.py", line 106, in __init__
(pid=10029) self.setup(copy.deepcopy(self.config))
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/rllib/agents/trainer_template.py", line 147, in setup
(pid=10029) super().setup(config)
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 793, in setup
(pid=10029) self.evaluation_workers = self._make_workers(
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/rllib/agents/trainer.py", line 846, in _make_workers
(pid=10029) return WorkerSet(
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 103, in __init__
(pid=10029) self._local_worker = self._make_worker(
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/rllib/evaluation/worker_set.py", line 399, in _make_worker
(pid=10029) worker = cls(
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/rllib/evaluation/rollout_worker.py", line 720, in __init__
(pid=10029) self.input_reader: InputReader = input_creator(self.io_context)
(pid=10029) File "/home/user/aurora-rl/src/rllib_drones/server.py", line 37, in _input
(pid=10029) return PolicyServerInput(
(pid=10029) File "/home/user/miniconda/lib/python3.9/site-packages/ray/rllib/env/policy_server_input.py", line 92, in __init__
(pid=10029) HTTPServer.__init__(self, (address, port), handler)
(pid=10029) File "/home/user/miniconda/lib/python3.9/socketserver.py", line 452, in __init__
(pid=10029) self.server_bind()
(pid=10029) File "/home/user/miniconda/lib/python3.9/http/server.py", line 138, in server_bind
(pid=10029) socketserver.TCPServer.server_bind(self)
(pid=10029) File "/home/user/miniconda/lib/python3.9/socketserver.py", line 466, in server_bind
(pid=10029) self.socket.bind(self.server_address)
(pid=10029) OSError: [Errno 98] Address already in use
Error executing job with overrides: []
Traceback (most recent call last):
File "/home/user/miniconda/lib/python3.9/site-packages/clearml/binding/hydra_bind.py", line 146, in _patched_task_function
return task_function(a_config, *a_args, **a_kwargs)
File "/home/user/aurora-rl/src/rllib_drones/server.py", line 94, in main
analysis = tune.run(
File "/home/user/miniconda/lib/python3.9/site-packages/ray/tune/tune.py", line 611, in run
raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [PPO_None_a8cae_00000])
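
The last frame of the worker traceback is `self.socket.bind(self.server_address)` failing with `OSError: [Errno 98] Address already in use`, and it is reached from `self.evaluation_workers = self._make_workers(`, so the bind that fails happens while the evaluation worker set is being built. A minimal sketch for checking whether an address is already taken before the server starts is below. `port_is_free` is a hypothetical helper (not part of Ray or RLlib), and the host and port in the usage note are placeholders, since the log does not show the values passed to `PolicyServerInput`.

```python
import errno
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently bind to (host, port).

    Hypothetical diagnostic helper, not part of Ray or RLlib.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        # http.server.HTTPServer sets allow_reuse_address, so the probe sets
        # SO_REUSEADDR as well to behave like the server's own bind call.
        probe.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                # Same condition as "[Errno 98] Address already in use" on Linux.
                return False
            raise  # Any other failure (bad host, permissions) is a different problem.
        return True
```

Calling something like `port_is_free("localhost", 9900)` (placeholder values) right before `tune.run(` shows whether another process or an earlier run is still holding the address. If it returns `True` there and the error still appears, the address is probably being bound more than once within the same run.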