CQL for discrete action space


Is there a specific reason why CQL is only available for continuous action spaces? I have an offline RL use case where I only have access to a fixed MDP dataset with discrete actions and no access to a simulator. Can I somehow use the RLlib implementation of CQL for this use case?

At the moment, our CQL doesn’t support discrete action spaces. You can, however, extend either the CQL torch or tf policy and implement logic similar to that of the SAC tf/torch policy in order to handle discrete spaces.

You’d have to follow one of our custom policy examples to actually implement this.
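
For what it's worth, here is a rough sketch of what the conservative penalty looks like when actions are discrete. It is only meant to illustrate the idea and is not RLlib code: the function names (`logsumexp`, `discrete_cql_penalty`, `discrete_cql_loss`) and the `alpha` default are made up for this example, and it uses the plain Q-learning form of the CQL regularizer (log-sum-exp of Q over all actions minus Q of the dataset action) rather than the SAC-based policy logic mentioned above. It assumes you already have per-action Q-values and TD targets for a batch.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def discrete_cql_penalty(q_values, actions):
    """Batch mean of logsumexp_a Q(s, a) - Q(s, a_data).

    q_values: list of per-state lists, one Q-value per discrete action.
    actions:  list of action indices taken in the offline dataset.
    """
    gaps = [logsumexp(q_row) - q_row[a] for q_row, a in zip(q_values, actions)]
    return sum(gaps) / len(gaps)


def discrete_cql_loss(q_values, actions, td_targets, alpha=1.0):
    """Squared TD error plus the alpha-weighted conservative penalty.

    td_targets are assumed to be precomputed (e.g. r + gamma * max_a' Q_target).
    alpha is a hypothetical trade-off weight, not an RLlib config key.
    """
    td_errors = [
        (q_row[a] - y) ** 2 for q_row, a, y in zip(q_values, actions, td_targets)
    ]
    bellman = 0.5 * sum(td_errors) / len(td_errors)
    return bellman + alpha * discrete_cql_penalty(q_values, actions)


if __name__ == "__main__":
    q = [[1.0, 0.5, -0.2], [0.1, 0.3, 0.0]]  # toy Q-values for 2 states, 3 actions
    a = [0, 2]                               # actions stored in the dataset
    y = [0.9, 0.2]                           # toy TD targets
    print(discrete_cql_loss(q, a, y, alpha=1.0))
```

The penalty is never negative, because the log-sum-exp is at least as large as any single Q-value. Minimizing it pushes down Q-values for actions that are not in the dataset relative to the ones that are. In a real custom policy you would compute the same quantity on tensors inside the loss function so that gradients flow through it.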

Thanks for the answer, that makes things clear to me!