**How severe does this issue affect your experience of using Ray?**

- High: It blocks me from completing my task.

I’m currently using a DQN algorithm to train an autonomous vehicle in CARLA.

I am looking to combine an image-based observation with continuous observations. I am currently using a Dict space containing two Box spaces.

```
# assumes: import numpy as np; from gym.spaces import Box, Dict

def get_observation_space(self):
    """
    Set observation space as the location of the vehicle in x, y,
    starting at (0, 0) and ending at (1, 1).
    :return: the observation space
    """
    spaces = {
        'values': Box(low=np.array([0, 0, 0, 0, 0, 0, 0]),
                      high=np.array([1, 1, 1, 1, 1, 1, 50]),
                      dtype=np.float32),
        'depth_camera': Box(low=0, high=256, shape=(240, 320, 3),
                            dtype=np.float32),
    }
    obs_space = Dict(spaces)
    return obs_space
```

In this case,

1. Is the image observation flattened, so that no meaningful spatial information can be extracted by the algorithm?
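To make my concern concrete, here is a minimal NumPy sketch of what flattening the Dict observation would look like (shapes taken from the spaces above; this is just my illustration, not RLlib's actual preprocessor):

```python
import numpy as np

# Sample observations matching the two Box spaces above.
values = np.zeros(7, dtype=np.float32)                     # shape (7,)
depth_camera = np.zeros((240, 320, 3), dtype=np.float32)   # shape (240, 320, 3)

# If the Dict observation is flattened into one vector, the image
# loses its 2-D spatial structure:
flat = np.concatenate([values.ravel(), depth_camera.ravel()])
print(flat.shape)  # (230407,)
```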

All the examples I have looked at make use of only an image observation, feeding it directly to the algorithm via a Box space.

```
def get_observation_space(self):
    """
    Set observation space as the location of the vehicle in x, y,
    starting at (0, 0) and ending at (1, 1).
    :return: the observation space
    """
    obs_space = Box(low=0, high=256, shape=(240, 320, 3), dtype=np.float32)
    return obs_space
```

In this case,

2. How does a DQN algorithm process an image observation?

3. Can I customize how the algorithm processes the image observation?
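For context on what I mean by customizing: from the RLlib docs I gather that either a custom model can be registered and referenced from the config, or the default vision network's layout can be overridden. Here is a minimal sketch of the config I have in mind (the model name and the conv filter values are my own guesses for a 240x320 input, not something I have verified):

```python
# Sketch only: "my_vision_model" would be a model class registered via
# ModelCatalog.register_custom_model; the conv_filters values are guesses.
config = {
    "model": {
        # option A: use a custom model instead of the default vision network
        "custom_model": "my_vision_model",
        # option B: override the default CNN's filter layout
        # (each entry: [out_channels, kernel, stride])
        "conv_filters": [[16, [8, 8], 4], [32, [4, 4], 2], [256, [11, 11], 1]],
    },
}
```

Is something along these lines the intended way to control how the image is processed?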

Any help is appreciated and thank you for your time!