For box action spaces, are the logits in the form
[mean1, std1, mean2, std2, …]
or
[mean1, mean2, …, std1, std2, …]
?
Hi, can you give a bit more context on what you’re trying to build here and what Ray library you’re using? Thanks
Hi. It has been a while, so you may have already found the answer, but here it is anyway:
it’s the second layout: all the means first, then all the stds.
e.g.
output = torch.concat((means, stds), dim=-1)
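On the consuming side (for example, when reading the exported model's output), that layout means the flat vector can simply be cut in half. Here is a minimal sketch in plain Python, assuming the output is a flat sequence of length 2 * action_dim. `split_gaussian_output` is a hypothetical helper name for illustration, not a Ray function. Some policy implementations put log-stds rather than stds in the second half, so it is worth checking against a known input before relying on the values.

```python
def split_gaussian_output(output):
    """Split [mean1, ..., meanN, std1, ..., stdN] into (means, stds).

    Assumes a flat sequence whose first half holds the means and whose
    second half holds the stds (or log-stds, depending on the model).
    """
    if len(output) % 2 != 0:
        raise ValueError("expected an even number of values")
    half = len(output) // 2
    return list(output[:half]), list(output[half:])


# Example with a hypothetical 2-dimensional box action space.
means, stds = split_gaussian_output([0.1, -0.3, 0.5, 0.2])
```

The same split carries over to other languages: take the first `len / 2` elements as means and the rest as stds.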
@christina I am taking the torch model files produced by a Ray Train job, converting them to ONNX files, and building a Rust-based inference program.
Thanks!