Hello colleagues!
I have a custom environment with action_space: Box([-600. -600. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.], [ 600. 600. 600. 520. 520. 520. 520. 2880. 1920. 520. 600. 600. 520. 520. 520. 520. 2880. 1920. 520. 600.], (20,), float32), but forward_inference returns actions in the interval (-1, 1) for all 20 dimensions. With forward_exploration, the actions fall roughly in the interval (-5, 5) for all 20 dimensions, even though 18 of the 20 dimensions have a lower bound of 0 and should never be negative.
Should the DreamerV3 algorithm rescale these actions to the environment's action space according to some parameter that I have missed, or do I have to rescale the actions myself?
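In case I do have to do it myself, this is a minimal sketch of what I have in mind. It assumes the raw outputs are meant to live in [-1, 1] and that a plain linear mapping per dimension is the right thing to do. The helper name rescale_action is my own and not part of any library, and clipping the out-of-range exploration outputs first is also just my assumption:

```python
import numpy as np

# Bounds copied from the Box action space of my environment.
low = np.array([-600, -600] + [0] * 18, dtype=np.float32)
high = np.array(
    [600, 600, 600, 520, 520, 520, 520, 2880, 1920, 520,
     600, 600, 520, 520, 520, 520, 2880, 1920, 520, 600],
    dtype=np.float32,
)


def rescale_action(action, low, high):
    """Linearly map an action from [-1, 1] to [low, high], per dimension.

    Hypothetical helper, not an RLlib function. Values outside [-1, 1]
    (e.g. from forward_exploration) are clipped first, which is an
    assumption about how they should be handled.
    """
    action = np.clip(np.asarray(action, dtype=np.float32), -1.0, 1.0)
    return low + 0.5 * (action + 1.0) * (high - low)


# Example: a raw action of all zeros lands at the midpoint of each range.
env_action = rescale_action(np.zeros(20), low, high)
```

With this mapping, -1 goes to the lower bound and +1 to the upper bound of each dimension, so the 18 dimensions with a lower bound of 0 never receive a negative value.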
Thank you in advance!
Kind regards, Alex.