How severely does this issue affect your experience of using Ray?
High: It blocks me from completing my task.
Hi,
Aim: In the forward method, I want ch_act to be discrete values and ch_msg to be continuous (float), but I am getting both as floats. I don't want the actions to be flattened. How do I achieve this?
I believe the pre/post processors will flatten the space. I use Abmarl's RavelDiscreteWrapper when I want to ensure that something is treated as discrete. This wrapper converts the space to a single Discrete space (in your case Discrete(216), since MultiDiscrete([6, 6, 6]) has 6 × 6 × 6 = 216 combinations), so that it is one-hot encoded in the flattening.
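To make the raveling idea concrete, here is a minimal sketch of the index mapping using plain NumPy. It is not Abmarl's implementation, and the helper names `ravel_action` and `unravel_action` are made up for illustration; it only shows how a MultiDiscrete([6, 6, 6]) action can be mapped to one Discrete(216) index and back.

```python
import numpy as np

NVEC = (6, 6, 6)  # component sizes of MultiDiscrete([6, 6, 6])


def ravel_action(multi_action, nvec=NVEC):
    """Map a MultiDiscrete action such as [1, 2, 3] to a single Discrete index."""
    return int(np.ravel_multi_index(tuple(multi_action), nvec))


def unravel_action(index, nvec=NVEC):
    """Map a Discrete index back to the MultiDiscrete action it encodes."""
    return [int(i) for i in np.unravel_index(index, nvec)]


n_combinations = int(np.prod(NVEC))  # size of the equivalent Discrete space
idx = ravel_action([1, 2, 3])        # row-major index: 1*36 + 2*6 + 3
print(n_combinations, idx, unravel_action(idx))
```

The environment would then expose Discrete(216) to the trainer and call something like `unravel_action` before applying the action internally.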
The issue I think you might be running into here is that the actions are not included as input to the policy network. It is the task of the policy network to take the observations as input and produce the actions. The actions should be the output of the forward method.
You could, for example, add a ViewRequirement to provide the actions from the previous time step as input.
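If previous actions do reach the model in flattened form, the discrete part can still be recovered from the float vector. The sketch below assumes the flattened layout is one one-hot block per MultiDiscrete component followed by the continuous message; the message length `MSG_DIM = 4` and both helper functions are hypothetical, so check the actual layout in your setup before relying on it.

```python
import numpy as np

NVEC = (6, 6, 6)  # component sizes of ch_act
MSG_DIM = 4       # hypothetical length of ch_msg, not taken from the original post


def flatten_action(ch_act, ch_msg, nvec=NVEC):
    """Build the assumed flat layout: one-hot blocks for ch_act, then ch_msg."""
    one_hots = [np.eye(n, dtype=np.float32)[a] for a, n in zip(ch_act, nvec)]
    return np.concatenate(one_hots + [np.asarray(ch_msg, dtype=np.float32)])


def split_flat_action(flat, nvec=NVEC):
    """Recover integer ch_act and float ch_msg from the assumed flat layout."""
    ch_act, offset = [], 0
    for n in nvec:
        # Each block is one-hot, so argmax gives back the discrete value.
        ch_act.append(int(np.argmax(flat[offset:offset + n])))
        offset += n
    return ch_act, flat[offset:]


flat = flatten_action([1, 2, 3], [0.5, -0.25, 0.0, 1.0])
ch_act, ch_msg = split_flat_action(flat)
print(flat.shape, ch_act, ch_msg)
```

The same slicing and argmax logic can be written with tensor operations inside the model's forward method if the input arrives as a batch.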