Structure sequence length mismatch in SGD code for PPO policy

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

I am trying to train a PPO policy in a custom environment. After training for around 50 iterations, the following error is thrown:

ERROR trial_runner.py:1088 -- Trial experiment_HierarchicalGraphColorEnv_bc37e_00000: Error processing event.
ray.exceptions.RayTaskError(ValueError): ray::ImplicitFunc.train() (pid=8039, ip=172.10.3.120, repr=experiment)
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/tune/trainable/trainable.py", line 367, in train
    raise skipped from exception_cause(skipped)
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/tune/trainable/function_trainable.py", line 338, in entrypoint
    self._status_reporter.get_checkpoint(),
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/tune/trainable/function_trainable.py", line 652, in _trainable_func
    output = fn()
  File "/home/venkatakeerthy.cs.iith/ML-Register-Allocation/model/RegAlloc/ggnn_drl/rllib_split_model/src/experiment_ppo.py", line 59, in experiment
    train_results = train_agent.train()
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/tune/trainable/trainable.py", line 367, in train
    raise skipped from exception_cause(skipped)
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/tune/trainable/trainable.py", line 364, in train
    result = self.step()
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/rllib/algorithms/algorithm.py", line 749, in step
    results, train_iter_ctx = self._run_one_training_iteration()
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/rllib/algorithms/algorithm.py", line 2623, in _run_one_training_iteration
    results = self.training_step()
  File "/home/venkatakeerthy.cs.iith/ML-Register-Allocation/model/RegAlloc/ggnn_drl/rllib_split_model/src/ppo_new.py", line 379, in training_step
    train_results = train_one_step(self, train_batch)
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/rllib/execution/train_ops.py", line 62, in train_one_step
    [],
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/rllib/utils/sgd.py", line 135, in do_minibatch_sgd
    learner_info = learner_info_builder.finalize()
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/ray/rllib/utils/metrics/learner_info.py", line 87, in finalize
    _all_tower_reduce, *results_all_towers
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/tree/__init__.py", line 550, in map_structure_with_path
    **kwargs)
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/tree/__init__.py", line 841, in map_structure_with_path_up_to
    shallow_structure, input_tree, check_types=check_types)
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/tree/__init__.py", line 684, in _assert_shallow_structure
    shallow_branch, input_branch, check_types=check_types)
  File "/home/users/anaconda3/envs/conda-env/lib/python3.7/site-packages/tree/__init__.py", line 664, in _assert_shallow_structure
    shallow_length=_num_elements(shallow_tree)))
ValueError: The two structures don't have the same sequence length. Input structure has length 10, while shallow structure has length 11.
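For context on the mechanism (my reading of the traceback, not the actual RLlib code): `LearnerInfoBuilder.finalize()` reduces the per-tower results with dm-tree's `map_structure_with_path`, which requires every input structure to have exactly the same shape. A minimal sketch of that kind of structure-wise reduction failing when one tower's result list has 10 entries and another's has 11:

```python
# Minimal sketch (NOT the actual RLlib code) of a structure-wise
# reduction that fails when inputs have different sequence lengths.
def reduce_tower_results(results):
    lengths = {len(r) for r in results}
    if len(lengths) != 1:
        # Mimics the dm-tree structure check seen in the traceback.
        raise ValueError(
            "The two structures don't have the same sequence length. "
            f"Input structure has length {max(lengths)}, "
            f"while shallow structure has length {min(lengths)}."
        )
    # Element-wise mean across towers.
    return [sum(vals) / len(vals) for vals in zip(*results)]

tower_a = [0.1] * 10  # one tower reported 10 stat entries
tower_b = [0.2] * 11  # another reported 11
try:
    reduce_tower_results([tower_a, tower_b])
except ValueError as e:
    print(e)
```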

I recently upgraded Ray from 1.4 to 2.2.0. In the newer version (2.2.0), the code that collects SGD result info has changed, and the error occurs only in that changed code.

Any help in understanding the issue better or fixing it is really appreciated.
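One hypothesis I am checking (an assumption on my part, not confirmed): a per-minibatch stats dict that occasionally gains or loses an entry, e.g. a metric logged only under certain conditions, would produce exactly this 10-vs-11 length mismatch at finalize time. A small sanity-check helper along those lines (the helper is my own sketch, not an RLlib API):

```python
# Hypothetical debugging helper: record the set of stat keys seen on
# the first call and raise as soon as a later call reports a
# different set, pinpointing where the structure starts to drift.
_expected_keys = None

def check_stats_keys(stats):
    global _expected_keys
    keys = sorted(stats)
    if _expected_keys is None:
        _expected_keys = keys  # remember the first-seen key set
    elif keys != _expected_keys:
        raise RuntimeError(
            f"Stat keys changed: {_expected_keys} -> {keys}"
        )

check_stats_keys({"total_loss": 0.5, "kl": 0.01})
check_stats_keys({"total_loss": 0.4, "kl": 0.02})  # same keys: OK
```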

I am still facing this issue, can someone help with this?

@Siddharth_Jain could you provide a reproducible example? I can take a look at it.