Hi, I have a question.
I need to handle both continuous and discrete actions with a deterministic policy algorithm.
How could I do that?
One way is to convert the discrete action index into a continuous vector, as is done in MuZero.
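To make the first idea concrete, here is a minimal sketch, assuming the discrete index is encoded as a one-hot vector and concatenated with the continuous values. The function names `encode_hybrid_action` and `decode_hybrid_action` are hypothetical, and picking the discrete action by argmax over the first block is my assumption, not something taken from a specific library.

```python
def encode_hybrid_action(discrete_index, num_discrete, continuous_part):
    """Return one flat list: one-hot of the discrete choice, then the continuous values."""
    if not 0 <= discrete_index < num_discrete:
        raise ValueError("discrete_index out of range")
    one_hot = [0.0] * num_discrete
    one_hot[discrete_index] = 1.0
    return one_hot + [float(x) for x in continuous_part]


def decode_hybrid_action(vector, num_discrete):
    """Recover (discrete_index, continuous_part) from a flat vector.

    The first num_discrete entries are treated as scores and the largest
    one selects the discrete action (assumption: argmax decoding).
    """
    scores = vector[:num_discrete]
    discrete_index = max(range(num_discrete), key=lambda i: scores[i])
    return discrete_index, list(vector[num_discrete:])
```

With this layout the policy network can output a single real-valued vector, and the decode step turns it back into a (discrete, continuous) pair before it reaches the environment.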
Another way is to add an environment wrapper that exposes a fully continuous action space to the agent and handles the thresholding of continuous values into discrete actions itself.
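The wrapper idea could look roughly like the sketch below. It assumes the policy outputs a flat vector whose first component lies in [-1, 1] and is split into equal-width bins to pick the discrete action, while the remaining components are passed through as the continuous part. The class name `ThresholdedActionWrapper` and the `(index, continuous)` tuple expected by the inner environment are hypothetical, and the wrapper is duck-typed rather than tied to a particular RL library.

```python
class ThresholdedActionWrapper:
    """Expose a purely continuous action interface over a hybrid-action env."""

    def __init__(self, env, num_discrete):
        self.env = env                    # inner env, assumed to take (index, continuous)
        self.num_discrete = num_discrete  # number of discrete choices

    def to_hybrid(self, action):
        # Clip the first component to [-1, 1] and rescale it to [0, 1].
        u = (min(1.0, max(-1.0, action[0])) + 1.0) / 2.0
        # Equal-width bins; the top edge (u == 1.0) falls into the last bin.
        index = min(int(u * self.num_discrete), self.num_discrete - 1)
        return index, list(action[1:])

    def reset(self, *args, **kwargs):
        return self.env.reset(*args, **kwargs)

    def step(self, action):
        # Translate the continuous vector before forwarding it to the inner env.
        return self.env.step(self.to_hybrid(action))
```

One thing to keep in mind with this design: the binning is piecewise constant, so small changes in the first component do not change the discrete choice except at bin boundaries, which may matter for a deterministic policy gradient method.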