[HIGH] TypeError: policy_mapping_fn() takes 1 positional argument but 2 were given

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I am running into the following error, which seems to stem from a function call inside the Ray package itself (ray 2.8.1, installed from PyPI), and I am not sure what I need to do to fix it. Any suggestions would be greatly appreciated.

(PPO pid=14214) 2023-12-03 19:57:56,079	INFO algorithm_config.py:3679 -- Your framework setting is 'tf', meaning you are using static-graph mode. Set framework='tf2' to enable eager execution with tf2.x. You may also then want to set eager_tracing=True in order to reach similar execution speed as with static-graph mode.
(PPO pid=14214) Trainable.setup took 17.078 seconds. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
(PPO pid=14214) Install gputil for GPU system monitoring.
(PPO pid=14214) 2023-12-03 19:57:54,368	INFO dynamic_tf_policy_v2.py:710 -- Adding extra-action-fetch `action_prob` to view-reqs. [repeated 2x across cluster]
(PPO pid=14214) 2023-12-03 19:57:54,368	INFO dynamic_tf_policy_v2.py:710 -- Adding extra-action-fetch `action_logp` to view-reqs. [repeated 2x across cluster]
(PPO pid=14214) 2023-12-03 19:57:54,369	INFO dynamic_tf_policy_v2.py:710 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs. [repeated 2x across cluster]
(PPO pid=14214) 2023-12-03 19:57:54,369	INFO dynamic_tf_policy_v2.py:710 -- Adding extra-action-fetch `vf_preds` to view-reqs. [repeated 2x across cluster]
(PPO pid=14214) 2023-12-03 19:57:54,369	INFO dynamic_tf_policy_v2.py:722 -- Testing `postprocess_trajectory` w/ dummy batch. [repeated 2x across cluster]
(RolloutWorker pid=14263) 2023-12-03 19:57:56,310	INFO rollout_worker.py:690 -- Generating sample batch of size 100
2023-12-03 19:57:56,446	ERROR tune_controller.py:1383 -- Trial task failed for trial PPO_meltingpot_32fc7_00000
Traceback (most recent call last):
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
             ^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/_private/auto_init_hook.py", line 24, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/_private/worker.py", line 2563, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(TypeError): ray::PPO.train() (pid=14214, ip=192.168.1.15, actor_id=17553ad788b3f5db04b48e7d01000000, repr=PPO)
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/tune/trainable/trainable.py", line 342, in train
    raise skipped from exception_cause(skipped)
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/tune/trainable/trainable.py", line 339, in train
    result = self.step()
             ^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/algorithms/algorithm.py", line 853, in step
    results, train_iter_ctx = self._run_one_training_iteration()
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/algorithms/algorithm.py", line 2854, in _run_one_training_iteration
    results = self.training_step()
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/algorithms/ppo/ppo.py", line 429, in training_step
    train_batch = synchronous_parallel_sample(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/execution/rollout_ops.py", line 85, in synchronous_parallel_sample
    sample_batches = worker_set.foreach_worker(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/evaluation/worker_set.py", line 680, in foreach_worker
    handle_remote_call_result_errors(remote_results, self._ignore_worker_failures)
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/evaluation/worker_set.py", line 76, in handle_remote_call_result_errors
    raise r.get()
ray.exceptions.RayTaskError(TypeError): ray::RolloutWorker.apply() (pid=14263, ip=192.168.1.15, actor_id=c07aa3e62e7bae9ab5f2e48301000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f3cc999c550>)
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/utils/actor_manager.py", line 185, in apply
    raise e
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/utils/actor_manager.py", line 176, in apply
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/execution/rollout_ops.py", line 86, in <lambda>
    lambda w: w.sample(), local_worker=False, healthy_only=True
              ^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/evaluation/rollout_worker.py", line 696, in sample
    batches = [self.input_reader.next()]
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/evaluation/sampler.py", line 92, in next
    batches = [self.get_data()]
               ^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/evaluation/sampler.py", line 277, in get_data
    item = next(self._env_runner)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/evaluation/env_runner_v2.py", line 344, in run
    outputs = self.step()
              ^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/evaluation/env_runner_v2.py", line 370, in step
    active_envs, to_eval, outputs = self._process_observations(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/evaluation/env_runner_v2.py", line 536, in _process_observations
    policy_id: PolicyID = episode.policy_for(agent_id)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nathan/anaconda3/envs/mp/lib/python3.11/site-packages/ray/rllib/evaluation/episode_v2.py", line 120, in policy_for
    policy_id = self._agent_to_policy[agent_id] = self.policy_mapping_fn(
                                                  ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: get_config.<locals>.policy_mapping_fn() takes 1 positional argument but 2 were given
(PPO pid=14214) 2023-12-03 19:57:56,441	ERROR actor_manager.py:500 -- Ray error, taking actor 1 out of service. ray::RolloutWorker.apply() (pid=14263, ip=192.168.1.15, actor_id=c07aa3e62e7bae9ab5f2e48301000000, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x7f3cc999c550>)
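
In case it helps with diagnosing this: my reading of the traceback is that episode_v2.py seems to call policy_mapping_fn with the agent id plus the episode as positional arguments (and a worker keyword), while the policy_mapping_fn created inside get_config in my meltingpot setup apparently still accepts only a single argument. Below is a minimal sketch of the signature I think Ray 2.8 expects; the argument names, the "default_policy" id, and the single shared policy are assumptions for illustration only, not my actual mapping.

from ray.rllib.algorithms.ppo import PPOConfig

# Sketch only: newer RLlib versions call policy_mapping_fn with extra
# positional arguments, so the function needs to accept at least
# (agent_id, episode), plus **kwargs for forward compatibility.
def policy_mapping_fn(agent_id, episode, worker=None, **kwargs):
    # Map every agent to one shared policy purely for illustration.
    return "default_policy"

config = PPOConfig().multi_agent(
    policies={"default_policy"},
    policy_mapping_fn=policy_mapping_fn,
)

Is updating the function signature along these lines the right fix here, or is there a configuration option I am missing?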