How severely does this issue affect your experience of using Ray?
- High: It blocks me from completing my task.
I am new to Ray and am using it for an RL problem, but I am now facing a series of issues.
1) When I cloned Ray from GitHub, ppo.py and other files import from "ray.util" and "ray.tune", which are not in my cloned package. Are they located in a different package?
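For what it's worth, here is the minimal check I would use to see which Ray installation Python actually picks up (a sketch; nothing in it is specific to my setup):

import ray

# Print where the imported ray package lives and its version.
# If __file__ points at site-packages rather than the cloned repo,
# then the clone is not the package actually being imported.
print(ray.__file__)
print(ray.__version__)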
2) If I install Ray using pip instead, then I see a different error:
(RolloutWorker pid=28860) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\gym\spaces\box.py", line 156, in contains
(RolloutWorker pid=28860) x = np.asarray(x, dtype=self.dtype)
(RolloutWorker pid=28860) TypeError: float() argument must be a string or a number, not 'dict'
The full log message is attached further below.
Because I cannot step through the pip-installed package in a debugger, I cannot track down the cause of this problem. In fact, in an earlier attempt I saw a different error; because of that error, I uninstalled Ray and cloned it from GitHub instead, but that did not work, as described in 1). So I reinstalled Ray via pip to copy the log for this question, and the error above is what I am looking at now.
Minor) I am also not sure how to address this warning message:
DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp, sys, os
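For reference, my understanding of the importlib replacement for a typical imp use (e.g. imp.load_source) is sketched below, following the stdlib documentation; the warning itself is raised from a file I do not control, so I am unsure whether there is anything for me to change. The module name and path here are placeholders, not names from my project:

import importlib.util

# Rough importlib equivalent of imp.load_source("mymod", path).
spec = importlib.util.spec_from_file_location("mymod", "C:/path/to/mymod.py")
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)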
Here is the log for issue 2):
The above error has been found in your environment! We've added a module for checking your custom environments. It may cause your experiment to fail if your environment is not set up correctly. You can disable this behavior by setting disable_env_checking=True in your environment config dictionary. You can run the environment checking module standalone by calling ray.rllib.utils.check_env([env]).
(RolloutWorker pid=28244) C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\gym\spaces\box.py:155: UserWarning: WARN: Casting input x to numpy array.
(RolloutWorker pid=28244) logger.warn("Casting input x to numpy array.")
(RolloutWorker pid=28244) C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\gym\spaces\box.py:156: DeprecationWarning: setting an array element with a sequence. This was supported in some cases where the elements are arrays with a single element. For example np.array([1, np.array([2])], dtype=int). In the future this will raise the same ValueError as np.array([1, [2]], dtype=int).
(RolloutWorker pid=28244) x = np.asarray(x, dtype=self.dtype)
(RolloutWorker pid=28244) 2022-12-21 13:15:24,804 ERROR worker.py:763 -- Exception raised in creation task: The actor died because of an error raised in its creation task, ray::RolloutWorker.__init__() (pid=28244, ip=127.0.0.1, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x0000022B467EB7F0>)
(RolloutWorker pid=28244) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\ray\rllib\utils\pre_checks\env.py", line 174, in check_gym_environments
(RolloutWorker pid=28244) if not env.observation_space.contains(reset_obs):
(RolloutWorker pid=28244) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\gym\spaces\box.py", line 156, in contains
(RolloutWorker pid=28244) x = np.asarray(x, dtype=self.dtype)
(RolloutWorker pid=28244) TypeError: float() argument must be a string or a number, not 'dict'
(RolloutWorker pid=28244)
(RolloutWorker pid=28244) During handling of the above exception, another exception occurred:
(RolloutWorker pid=28244)
(RolloutWorker pid=28244) ray::RolloutWorker.__init__() (pid=28244, ip=127.0.0.1, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x0000022B467EB7F0>)
(RolloutWorker pid=28244) File "python\ray\_raylet.pyx", line 823, in ray._raylet.execute_task
(RolloutWorker pid=28244) File "python\ray\_raylet.pyx", line 875, in ray._raylet.execute_task
(RolloutWorker pid=28244) File "python\ray\_raylet.pyx", line 830, in ray._raylet.execute_task
(RolloutWorker pid=28244) File "python\ray\_raylet.pyx", line 834, in ray._raylet.execute_task
(RolloutWorker pid=28244) File "python\ray\_raylet.pyx", line 780, in ray._raylet.execute_task.function_executor
(RolloutWorker pid=28244) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\ray\_private\function_manager.py", line 674, in actor_method_executor
(RolloutWorker pid=28244) return method(__ray_actor, *args, **kwargs)
(RolloutWorker pid=28244) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\ray\util\tracing\tracing_helper.py", line 466, in _resume_span
(RolloutWorker pid=28244) return method(self, *_args, **_kwargs)
(RolloutWorker pid=28244) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\ray\rllib\evaluation\rollout_worker.py", line 592, in __init__
(RolloutWorker pid=28244) check_env(self.env)
(RolloutWorker pid=28244) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\ray\rllib\utils\pre_checks\env.py", line 88, in check_env
(RolloutWorker pid=28244) raise ValueError(
(RolloutWorker pid=28244) ValueError: Traceback (most recent call last):
(RolloutWorker pid=28244) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\ray\rllib\utils\pre_checks\env.py", line 77, in check_env
(RolloutWorker pid=28244) check_gym_environments(env)
(RolloutWorker pid=28244) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\ray\rllib\utils\pre_checks\env.py", line 174, in check_gym_environments
(RolloutWorker pid=28244) if not env.observation_space.contains(reset_obs):
(RolloutWorker pid=28244) File "C:\Users\j.lee\Anaconda3\envs\rllib_39\lib\site-packages\gym\spaces\box.py", line 156, in contains
(RolloutWorker pid=28244) x = np.asarray(x, dtype=self.dtype)
(RolloutWorker pid=28244) TypeError: float() argument must be a string or a number, not 'dict'
(RolloutWorker pid=28244)
(RolloutWorker pid=28244) The above error has been found in your environment! We've added a module for checking your custom environments. It may cause your experiment to fail if your environment is not set up correctly. You can disable this behavior by setting disable_env_checking=True in your environment config dictionary. You can run the environment checking module standalone by calling ray.rllib.utils.check_env([env]).
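Regarding the workaround mentioned at the end of the log, this is how I understand disable_env_checking would be set (a sketch only; I assume it merely skips the pre-check rather than fixing whatever is actually wrong with my environment):

from ray.rllib.algorithms.ppo import PPOConfig

# Sketch: ask RLlib to skip its environment pre-check.
# I have not confirmed that this avoids the underlying TypeError.
config = (
    PPOConfig()
    .environment(env=ParrotEnv, disable_env_checking=True)
)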
Here is the code:
import gym
from ray.rllib.algorithms.ppo import PPOConfig


class ParrotEnv(gym.Env):
    def __init__(self, config):
        self.action_space = config.get(
            "parrot_shriek_range", gym.spaces.Box(-1.0, 1.0, shape=(1,)))
        self.observation_space = self.action_space
        self.cur_obs = None
        self.episode_len = 0

    def reset(self, *, seed=None, options=None):
        """Resets the episode and returns the initial observation of the new one."""
        self.episode_len = 0
        self.cur_obs = self.observation_space.sample()
        return self.cur_obs, {}

    def step(self, action):
        """Takes a single step in the episode given `action`.

        Returns:
            New observation, reward, terminated-flag, truncated-flag, info-dict (empty).
        """
        self.episode_len += 1
        terminated = False
        truncated = self.episode_len >= 10
        reward = -sum(abs(self.cur_obs - action))
        self.cur_obs = self.observation_space.sample()
        return self.cur_obs, reward, terminated, truncated, {}


config = (
    PPOConfig()
    .environment(
        env=ParrotEnv,
        env_config={
            "parrot_shriek_range": gym.spaces.Box(-5.0, 5.0, (1,))
        },
    )
    .rollouts(num_rollout_workers=3)
)
algo = config.build()

for i in range(5):
    results = algo.train()
    print(f"Iter: {i}; avg. reward={results['episode_reward_mean']}")

env = ParrotEnv({"parrot_shriek_range": gym.spaces.Box(-3.0, 3.0, (1,))})
obs, info = env.reset()
terminated = truncated = False
total_reward = 0.0
while not terminated and not truncated:
    action = algo.compute_single_action(obs)
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
print(f"Shrieked for 1 episode; total-reward={total_reward}")
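If it helps anyone reproduce this, I believe the standalone checker mentioned in the log should raise the same ValueError without starting any workers (a sketch, assuming the ParrotEnv class from the code above; I have not run exactly this):

import gym
from ray.rllib.utils import check_env

# Construct the environment directly and run RLlib's pre-check on it.
# If my understanding of the checker is right, this reproduces the
# ValueError from the log above.
env = ParrotEnv({"parrot_shriek_range": gym.spaces.Box(-5.0, 5.0, (1,))})
check_env(env)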