Error when running PPOTrainer

mingjunwang88 · October 15, 2021, 2:57pm

This is the code I ran:
config = DEFAULT_CONFIG.copy()
config['num_workers'] = 1
config['num_sgd_iter'] = 30
config['sgd_minibatch_size'] = 128
config['model']['fcnet_hiddens'] = [100, 100]
config['num_cpus_per_worker'] = 1

agent = PPOTrainer(config, 'CartPole-v1')

I am getting this error:

RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=127995, ip=172.16.35.94)
AttributeError: 'NoneType' object has no attribute 'config'

rusu24edward · October 15, 2021, 3:19pm

Can you provide more code? It looks like the Python interpreter is unable to find DEFAULT_CONFIG. Have you imported it correctly?

mingjunwang88 · October 15, 2021, 3:26pm

These are my imports:
from ray.rllib.agents.ppo import PPOTrainer, DEFAULT_CONFIG
from ray.tune.logger import pretty_print
import json
import pandas as pd
import gym

It looks normal. I tried the same code on a Mac and it works fine. I am working on Linux on AWS; here is the version: Linux ip-172-16-35-94 4.14.238-125.422.amzn1.x86_64 #1 SMP Tue Jul 20 20:51:46 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

It looks like all the trainers have the same problem. Does anyone have any idea?

mannyv · October 15, 2021, 8:43pm

Hi @mingjunwang88,

Is there more to the error log? I think this is probably not the main error. I often see it when there is some other initial error.

What version of Ray are you using? Here is a Google Colab with the setup you posted that is running OK.
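
A quick way to answer the version question is to print the installed versions of the relevant packages. This is only a sketch: `installed_version` is a hypothetical helper, the package list is a guess at what matters here, and `importlib.metadata` needs Python 3.8 or newer. On an older interpreter, `pip show ray` or `ray.__version__` gives the same information.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical list of packages worth reporting for an RLlib issue.
for name in ("ray", "tensorflow", "torch", "gym"):
    print(f"{name}: {installed_version(name)}")
```

A `None` next to `tensorflow` or `torch` would suggest that framework is missing from the environment the script runs in.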

mingjunwang88 · October 15, 2021, 9:15pm

Hi @mannyv: thanks for responding! I ran the same script on a Mac and it ran through. But I am running it on an AWS SageMaker instance. I installed as you did: pip install -U ray[all]. But it still failed as before. Here is the complete log:

RayActorError Traceback (most recent call last)
in
12 config["num_cpus_per_worker"] = 1
13
---> 14 agent = PPOTrainer(config, "CartPole-v1")

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py in __init__(self, config, env, logger_creator)
135
136 def __init__(self, config=None, env=None, logger_creator=None):
--> 137 Trainer.__init__(self, config, env, logger_creator)
138

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py in __init__(self, config, env, logger_creator)
609 logger_creator = default_logger_creator
610
--> 611 super().__init__(config, logger_creator)
612

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/tune/trainable.py in __init__(self, config, logger_creator)
104
105 start_time = time.time()
--> 106 self.setup(copy.deepcopy(self.config))
107 setup_time = time.time() - start_time
108 if setup_time > SETUP_TIME_THRESHOLD:

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py in setup(self, config)
145 self._override_all_subkeys_if_type_changes += \
146 override_all_subkeys_if_type_changes
--> 147 super().setup(config)
148
149 def _init(self, config: TrainerConfigDict,

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py in setup(self, config)
762 logging.getLogger("ray.rllib").setLevel(self.config["log_level"])
763
--> 764 self._init(self.config, self.env_creator)
765
766 # Evaluation setup.

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py in _init(self, config, env_creator)
174 policy_class=self._policy_class,
175 config=config,
--> 176 num_workers=self.config["num_workers"])
177 self.execution_plan = execution_plan
178 try:

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py in _make_workers(self, env_creator, validate_env, policy_class, config, num_workers)
850 trainer_config=config,
851 num_workers=num_workers,
--> 852 logdir=self.logdir)
853

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/evaluation/worker_set.py in __init__(self, env_creator, validate_env, policy_class, trainer_config, num_workers, logdir, _setup)
83 remote_spaces = ray.get(self.remote_workers(
84 )[0].foreach_policy.remote(
---> 85 lambda p, pid: (pid, p.observation_space, p.action_space)))
86 spaces = {
87 e[0]: (getattr(e[1], "original_space", e[1]), e[2])

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/_private/client_mode_hook.py in wrapper(*args, **kwargs)
87 if func.__name__ != "init" or is_client_mode_enabled_by_default:
88 return getattr(ray, func.__name__)(*args, **kwargs)
---> 89 return func(*args, **kwargs)
90
91 return wrapper

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/worker.py in get(object_refs, timeout)
1621 raise value.as_instanceof_cause()
1622 else:
-> 1623 raise value
1624
1625 if is_individual_id:

RayActorError: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=105003, ip=172.16.35.94)
AttributeError: 'NoneType' object has no attribute 'config'

During handling of the above exception, another exception occurred:

ray::RolloutWorker.__init__() (pid=105003, ip=172.16.35.94)
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 564, in __init__
devices = get_tf_gpu_devices()
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/utils/tf_ops.py", line 53, in get_gpu_devices
devices = tf.config.experimental.list_physical_devices()
AttributeError: 'NoneType' object has no attribute 'config'
(pid=105003) 2021-10-15 21:07:06,987 ERROR worker.py:428 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=105003, ip=172.16.35.94)
(pid=105003) AttributeError: 'NoneType' object has no attribute 'config'
(pid=105003)
(pid=105003) During handling of the above exception, another exception occurred:
(pid=105003)
(pid=105003) ray::RolloutWorker.__init__() (pid=105003, ip=172.16.35.94)
(pid=105003) File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 564, in __init__
(pid=105003) devices = get_tf_gpu_devices()
(pid=105003) File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/utils/tf_ops.py", line 53, in get_gpu_devices
(pid=105003) devices = tf.config.experimental.list_physical_devices()
(pid=105003) AttributeError: 'NoneType' object has no attribute 'config'

mannyv · October 15, 2021, 9:18pm

TensorFlow is not installed. These lines from your log show that tf is None inside the worker:

(pid=105003) devices = tf.config.experimental.list_physical_devices()
(pid=105003) AttributeError: 'NoneType' object has no attribute 'config'
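
One way to confirm this from the same notebook kernel or script is to ask the running interpreter whether it can locate the package. The sketch below uses only the standard library; `is_importable` is a hypothetical helper name. Printing `sys.executable` is included because, on managed environments with several conda envs, `pip` may have installed into a different interpreter than the one running the code (an assumption about this setup, not something the log proves).

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the running interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Which Python is actually executing this code?
print("interpreter:", sys.executable)
print("tensorflow importable:", is_importable("tensorflow"))
```

If the second line prints `False`, installing with `python -m pip install tensorflow` using that same interpreter keeps the install and the runtime in step.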

mingjunwang88 · October 15, 2021, 9:42pm

@mannyv: I just installed TensorFlow, but I am still seeing this at the beginning:
(pid=120014) 2021-10-15 21:38:47,045 ERROR worker.py:428 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=120014, ip=172.16.35.94)
(pid=120014) File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 587, in __init__
(pid=120014) seed=seed)
(pid=120014) File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1383, in _build_policy_map
(pid=120014) conf, merged_conf)
(pid=120014) File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/policy/policy_map.py", line 123, in create_policy
(pid=120014) sess = self.session_creator()
(pid=120014) File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/ray/rllib/evaluation/worker_set.py", line 316, in session_creator
(pid=120014) return tf1.Session(
(pid=120014) AttributeError: 'NoneType' object has no attribute 'Session'

Why is it tf1 this time?
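
On the tf1 question, a hedged note: RLlib imports TensorFlow through `try_import_tf()` in `ray.rllib.utils.framework`, which returns a tuple `(tf1, tf, tfv)`, where `tf1` is the TF1-compatible API. When the import fails, all three come back as `None`, so any later attribute access fails with the message seen in both tracebacks. The sketch below imitates that pattern with the standard library only; `try_import` and the stand-in module name are hypothetical, and it does not call RLlib itself.

```python
import importlib


def try_import(module_name):
    """Return the imported module, or None if the import fails (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Stand-in name that is assumed not to exist, playing the role of a missing TensorFlow.
tf1 = try_import("not_installed_tensorflow_stand_in")

try:
    tf1.Session()
except AttributeError as exc:
    message = str(exc)

print(message)  # 'NoneType' object has no attribute 'Session'
```

Seen this way, `tf.config` and `tf1.Session` failing are the same symptom: the TensorFlow import did not succeed in the worker process, it just surfaced at a different call site.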

mingjunwang88 · October 16, 2021, 3:14pm

@mannyv: it actually works now. I believe there was something wrong with the TensorFlow installation. Thanks for the help!
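
For anyone landing here from a PyTorch-only environment (the conda env in the traceback is named pytorch_p36), a possible alternative is to tell RLlib to use PyTorch via the `framework` config key instead of installing TensorFlow. This is a sketch under assumptions: it presumes PyTorch is installed, that the installed Ray version matches the `ray.rllib.agents.ppo` API used in this thread, and `build_trainer` is a hypothetical wrapper. The trainer merges a partial config into its defaults, so only the overrides are listed.

```python
# Overrides from the original post, plus "framework" (the assumed change).
overrides = {
    "framework": "torch",  # use PyTorch instead of the TensorFlow default
    "num_workers": 1,
    "num_sgd_iter": 30,
    "sgd_minibatch_size": 128,
    "model": {"fcnet_hiddens": [100, 100]},
    "num_cpus_per_worker": 1,
}


def build_trainer(env="CartPole-v1"):
    """Create the PPO trainer; Ray is imported lazily so the dict can be inspected without it."""
    from ray.rllib.agents.ppo import PPOTrainer

    return PPOTrainer(config=overrides, env=env)
```

Passing a plain dict of overrides also avoids mutating the nested `model` dict inside `DEFAULT_CONFIG`, which a shallow `.copy()` would share.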
