How do I troubleshoot "The two structures don't have the same nested structure"?

How severely does this issue affect your experience of using Ray?

  • High: It blocks me from completing my task.

Hi there. RLlib newbie here, still learning the ropes. I went through the tutorial notebook, played with it, got over some difficulties. I feel like I’m starting to get the hang of RLlib.

I was writing an experiment, and when I tried to run it, I got this error:

The two structures don't have the same nested structure.

First structure: type=Tensor str=Tensor("policy_0_wk2/obs:0", shape=(?, 3), dtype=int64)

Second structure: type=tuple str=(array([5]), array([2]), array([1]))

More specifically: Substructure "type=tuple str=(array([5]), array([2]), array([1]))" is a sequence, while substructure "type=Tensor str=Tensor("policy_0_wk2/obs:0", shape=(?, 3), dtype=int64)" is not
Entire first structure:
.
Entire second structure:
(., ., .)

I tried to understand why this is happening using the debugger. My thinking was: “RLlib is expecting two structures to have the same shape, and they don’t. It’s probably because I misconfigured something, and now I just need to figure out what it was. I’ll step through the code that raised the exception, look at the stack, see what the two structures are, and then probably change one of them to match the other.”

This approach did not work for me, probably because of the way that RLlib and TensorFlow are built. When I look at the stack that resulted in this error, I don't see any connection between it and any code that I've written in my experiment. I have no way of knowing which of the structures I'm responsible for.

So I have 2 questions here:

  1. What is causing my current problem, and how do I fix it?
  2. When I stumble on such exceptions in the future, how can I troubleshoot them without asking for help?

Thanks for your help,
Ram Rachum.

Hi @cool-RR,

Line 26 is making Tuple observations. Try converting them to a list or numpy array instead.
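
A minimal sketch of that conversion, assuming the observation is built as a tuple of one-element arrays like the one shown in the error message. The name raw_obs is hypothetical and only stands in for whatever the environment currently returns.

```python
import numpy as np

# Hypothetical observation, shaped like the tuple in the error message.
raw_obs = (np.array([5]), np.array([2]), np.array([1]))

# Flatten the tuple of one-element arrays into a single 1-D int64 array,
# which lines up with the (?, 3) int64 placeholder named in the error.
obs = np.concatenate(raw_obs).astype(np.int64)

print(type(raw_obs).__name__, "->", type(obs).__name__, obs.shape, obs.dtype)
```

The point is only that the policy receives one array of shape (3,) rather than a Python tuple of three arrays.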

You’re right. Thank you. I do still want to know how to be able to troubleshoot these issues…

I started seeing this issue a lot around Ray 1.10. I fixed it by explicitly setting the dtype of every numpy array to match the dtype I declared in the space. For example, for a space like Box(0, 2, (3,), dtype=int), I used to get away with np.array([1, 0, 2]), but now I have to write np.array([1, 0, 2], dtype=int) explicitly.
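
A rough sketch of how such a mismatch can be checked before the observation reaches RLlib. It assumes a float32 space of shape (3,) rather than the int space above, because NumPy's default float dtype (float64) makes the difference easy to see. The names expected_shape, expected_dtype and matches are hypothetical helpers, not part of RLlib or Gym.

```python
import numpy as np

# Assumed declaration, mirroring a float32 Box-like space of shape (3,).
expected_shape = (3,)
expected_dtype = np.float32

# Without an explicit dtype, NumPy stores Python floats as float64.
implicit = np.array([1.0, 0.0, 2.0])
# With an explicit dtype, the array matches the declared space.
explicit = np.array([1.0, 0.0, 2.0], dtype=expected_dtype)


def matches(obs):
    """Return True if obs is an ndarray with the declared shape and dtype."""
    return (
        isinstance(obs, np.ndarray)
        and obs.shape == expected_shape
        and obs.dtype == expected_dtype
    )


print(matches(implicit), matches(explicit))
```

Running a check like this on the values returned by reset() and step() is one way to find out which side of the mismatch comes from your own code.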

I ran into the same issue today and spent some time experimenting, having already applied the explicit dtype hint from @rusu24edward.

However, I would also like to highlight the info dict, which step() and reset() may return as an additional element. In that case the return value is a tuple, not just the observation, and passing it straight to e.g. compute_single_action can trigger this error. Remember to unpack the return value and discard the info dict, as shown in the following example:

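# reset() returns (obs, info); unpack it and discard the info dict.
obs, _ = test_env.reset()
action = trainer.compute_single_action(obs)
# step() returns five values; only obs goes back to the policy.
obs, reward, done, truncated, info = test_env.step(action)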
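
To make the difference visible without a full RLlib setup, here is a small sketch using a stand-in environment. StubEnv is hypothetical and only mimics an env whose reset() returns an (obs, info) pair.

```python
import numpy as np


class StubEnv:
    """Hypothetical stand-in for an env whose reset() returns (obs, info)."""

    def reset(self):
        return np.zeros(3, dtype=np.int64), {"note": "extra info"}


env = StubEnv()

# Keeping the whole return value gives a 2-tuple, not an observation.
result = env.reset()

# Unpacking separates the observation from the info dict.
obs, info = env.reset()

print(type(result).__name__, type(obs).__name__)
```

Passing result instead of obs to the policy is the kind of tuple-versus-tensor mismatch the error message describes.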