From Stable-Baselines3 to Ray RLlib

How severely does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

Hi all. I have been working on RL problems with SB3. Given the interesting opportunities Ray has to offer, I want to try it for a PPO problem, but I find it very complex:
My env has Dict observations and Discrete actions. I standardize each key of the dict with SB3's VecNormalize wrapper. For the policy, I have a custom feature extractor for each key of the dict; I then concatenate the results and feed them into separate value and policy networks. (Can’t wait to take advantage of attention and the other resources built into Ray!)
How can I switch to Ray? The way I work with SB3 is very different from what the Ray examples and docs show.

Hey @upi , thanks for the question!

Here are two example scripts describing the move from SB3 to RLlib:

Hope these help. We should also add a guide to our docs on how to migrate from SB3 to RLlib.
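
For the per-key standardization mentioned in the question, here is a minimal, dependency-free sketch of what that step does: it keeps a running mean and variance for every key of a dict observation and returns the standardized, clipped values. The names `RunningStat` and `DictObsNormalizer` are hypothetical and are not part of SB3 or RLlib; the clip value of 10 and the small epsilon are assumptions chosen for illustration. Something like this could live in an environment wrapper, so the normalization moves with the env regardless of the training library.

```python
import math


class RunningStat:
    """Running mean/variance of a fixed-length vector (Welford's algorithm)."""

    def __init__(self, size):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        for i, value in enumerate(x):
            delta = value - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (value - self.mean[i])

    def std(self, eps=1e-8):
        # With fewer than two samples the variance is undefined; use 1.0.
        if self.count < 2:
            return [1.0] * len(self.mean)
        return [math.sqrt(m2 / self.count + eps) for m2 in self.m2]


class DictObsNormalizer:
    """Hypothetical helper: standardizes each key of a dict observation."""

    def __init__(self, sizes, clip=10.0):
        # sizes maps each observation key to the length of its vector.
        self.stats = {key: RunningStat(n) for key, n in sizes.items()}
        self.clip = clip  # assumed clipping range for normalized values

    def __call__(self, obs, update=True):
        out = {}
        for key, x in obs.items():
            stat = self.stats[key]
            if update:  # pass update=False to freeze statistics (evaluation)
                stat.update(x)
            std = stat.std()
            out[key] = [
                max(-self.clip, min(self.clip, (v - m) / s))
                for v, m, s in zip(x, stat.mean, std)
            ]
        return out


normalizer = DictObsNormalizer({"pos": 2, "vel": 1})
first = normalizer({"pos": [0.0, 10.0], "vel": [1.0]})
second = normalizer({"pos": [2.0, 30.0], "vel": [3.0]})
```

Calling the normalizer with `update=False` leaves the statistics untouched, which is how one would typically run evaluation episodes with the statistics gathered during training.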

Hey @sven1977, those scripts came in handy, but they won’t be enough to move a complex environment like mine. I will try anyway. If I find a good way forward, I will be pleased to share it with the Ray community.
