How severely does this issue affect your experience of using Ray?
- None: Just asking a question out of curiosity
I’m wondering what the best strategy is to make a single custom gym environment run faster.
Is there any Ray feature that can be used for this purpose?
Is Numba (cuda) a good option?
Any recommendations on this topic are welcome.
Ray RLlib itself only supports speedup through parallelization or vectorization.
What you can do to speed up sampling depends very much on the environment itself. RLlib treats the environment mostly as a black box behind the Gym API, so what happens behind this API, and how fast it runs, is entirely up to the user. If your simulation is extremely costly and you can only sample from one or two simulators, you will likely want to put the networks on a GPU.
Here are two resources for simpler environments:
Torch implementation of Cartpole
Complete RL loop on a GPU
Have you tried profiling your simulation?
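In case it helps, here is a minimal sketch of how you could profile the `step` call with Python's built-in `cProfile`. `ToyEnv` and `profile_steps` are hypothetical names made up for this example, and the inner loop is only a stand-in for whatever your simulation does; swap in your own environment.

```python
import cProfile
import io
import pstats


class ToyEnv:
    """Hypothetical stand-in for a custom environment (not from this thread)."""

    def __init__(self):
        self.state = 0.0

    def _inner_loop(self):
        # Mimics an expensive per-step Python loop.
        total = 0.0
        for i in range(300):
            total += (self.state + i) ** 0.5
        return total

    def step(self, action):
        self.state += action
        reward = self._inner_loop()
        return self.state, reward, False, {}


def profile_steps(env, n_steps=1000):
    """Profile n_steps calls to env.step and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n_steps):
        env.step(1.0)
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and show the 10 most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_steps(ToyEnv()))
```

The report lists each function with its call count and cumulative time, which should show whether the time goes into the loop itself or somewhere else in `step`.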
Thank you for your answer @arturn. I’ll check both links, maybe I can apply something to my use case.
@rusu24edward I did. In one of the environments the bottleneck is a 300-iteration for loop that I have not found a way to optimize. The other custom environments are simpler. In any case, the environments need to be optimized because I have a limited budget for training. Any recommendation is welcome. Thanks.
I’ve not used numba myself, but I’ve looked into it and heard good things about it, so it’s probably a good place to start.
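For what it's worth, a rough sketch of what that could look like for a fixed-length loop is below. This is only an illustration under my own assumptions: `rollout` and its spring-like dynamics are invented for the example and have nothing to do with your actual environment. The `try`/`except` lets the same code run as plain Python when Numba is not installed.

```python
try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback (assumption for this sketch): no JIT, plain Python.
        return func


@njit
def rollout(x0, v0, dt, n_iters):
    """Advance a toy spring system for n_iters steps (illustrative only)."""
    x = x0
    v = v0
    for _ in range(n_iters):
        a = -x          # hypothetical dynamics: restoring force
        v = v + a * dt  # update velocity first
        x = x + v * dt  # then position, using the new velocity
    return x, v


if __name__ == "__main__":
    # With Numba installed, the first call also pays the compilation cost;
    # later calls reuse the compiled function.
    print(rollout(1.0, 0.0, 0.01, 300))
```

Numba compiles the function on its first call, so time it on later calls. It works best when the loop body only uses numbers and NumPy arrays rather than arbitrary Python objects, so the loop may need to be pulled out of the environment class into a standalone function like this one.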