[rllib] Reproducibility: is the current implementation enough?

Hi,

According to the official cuBLAS documentation (sec. 2.4), the environment variable should be set as:
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"


For CUDA 10.2 or later, the current strategy for reproducible results is not sufficient.
It only sets os.environ["CUBLAS_WORKSPACE_CONFIG"] = "4096:8" and does not call
torch.use_deterministic_algorithms(True).

I don't understand the intention behind this implementation. It seems incomplete to me.
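
For reference, here is a minimal sketch of the setup I would expect, combining both steps. The helper name `enable_determinism` and the seed value are hypothetical and not part of RLlib. It assumes PyTorch 1.8 or newer (where `torch.use_deterministic_algorithms` exists) and skips the torch-specific part if torch is not installed.

```python
import os
import random


def enable_determinism(seed: int = 0) -> bool:
    """Hypothetical helper (not RLlib code). Returns True if torch was configured."""
    # Set the cuBLAS workspace config with the leading colon, as in the cuBLAS docs.
    # This should happen before the first CUDA operation in the process.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

    # Seed Python's own RNG as well.
    random.seed(seed)

    try:
        import torch
    except ImportError:
        # torch is not available; only the environment variable and Python RNG are set.
        return False

    # Seed torch's RNGs.
    torch.manual_seed(seed)
    # Ask torch to use deterministic algorithms; ops without a deterministic
    # implementation will raise an error instead of silently being nondeterministic.
    torch.use_deterministic_algorithms(True)
    # Disable cuDNN autotuning, which can select different kernels between runs.
    torch.backends.cudnn.benchmark = False
    return True


torch_configured = enable_determinism(seed=42)
```

With this, the environment variable and the deterministic-algorithms flag are set together rather than one or the other.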