Custom Algorithm

Hi there,
I have to port an algorithm (decision trees + Q-learning) to RLlib and use it in a multi-agent env alongside PPO: n agents will be trained with PPO and one with DT + Q-learning. How should I port the algorithm into RLlib? I have no idea where to start.

To use different algorithms I’ll use this as an example.

P.S. The env already works with PPO in RLlib.
P.P.S. I'm on RLlib 0.8.4.

You can define your own algorithm by inheriting from the Algorithm and AlgorithmConfig classes and defining a policy.
Here is SimpleQ as an example to get started. You can probably inherit from SimpleQConfig and its policy to have some code structure to start with.
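
To give a rough idea of what the policy part has to provide, here is a minimal plain-Python sketch of a tabular Q-learning policy. It does not import or subclass anything from RLlib; the method names (compute_actions, learn_on_batch, get_weights, set_weights) mirror RLlib's Policy interface, but the class name TabularQPolicy, the hyperparameters and the tuple-based batch format are assumptions for illustration (a real RLlib policy receives a SampleBatch). The decision-tree part of the algorithm is left out.

```python
import random
from collections import defaultdict


class TabularQPolicy:
    """Plain-Python stand-in; method names mirror RLlib's Policy interface."""

    def __init__(self, num_actions, lr=0.1, gamma=0.99, epsilon=0.1, seed=0):
        self.num_actions = num_actions
        self.lr = lr
        self.gamma = gamma
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Q-table: hashable observation -> list of action values.
        self.q = defaultdict(lambda: [0.0] * num_actions)

    def compute_actions(self, obs_batch, state_batches=None, **kwargs):
        actions = []
        for obs in obs_batch:
            if self.rng.random() < self.epsilon:
                # Explore: pick a random action.
                actions.append(self.rng.randrange(self.num_actions))
            else:
                # Exploit: pick the action with the highest Q-value.
                values = self.q[obs]
                actions.append(values.index(max(values)))
        # RLlib policies return (actions, rnn_state_outs, extra_info).
        return actions, [], {}

    def learn_on_batch(self, samples):
        # Assumption: `samples` is an iterable of
        # (obs, action, reward, next_obs, done) tuples, not a SampleBatch.
        td_errors = []
        for obs, action, reward, next_obs, done in samples:
            if done:
                target = reward
            else:
                target = reward + self.gamma * max(self.q[next_obs])
            td = target - self.q[obs][action]
            self.q[obs][action] += self.lr * td
            td_errors.append(abs(td))
        return {"mean_td_error": sum(td_errors) / max(len(td_errors), 1)}

    def get_weights(self):
        return {obs: list(values) for obs, values in self.q.items()}

    def set_weights(self, weights):
        self.q.clear()
        for obs, values in weights.items():
            self.q[obs] = list(values)
```

Once something like this works on its own, the same logic can be moved into a class that actually inherits from RLlib's policy base class, with the batch handling adapted to whatever your RLlib version passes in.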

For your first attempts at running the code, you can simply start Ray in local mode (ray.init(local_mode=True)), instantiate the Algorithm object, and start debugging.
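
For the mixed setup from the question (n PPO agents plus one DT + Q-learning agent), it can also help to check the agent-to-policy mapping in isolation before starting Ray at all. The sketch below is plain Python: the agent id "agent_0" and the policy ids "ppo_policy" and "dt_q_policy" are made up, and the exact arguments RLlib passes to the mapping function differ between versions, which is why extra arguments are accepted and ignored.

```python
# Hypothetical id of the single agent that uses decision trees + Q-learning.
DT_Q_AGENT = "agent_0"


def policy_mapping_fn(agent_id, *args, **kwargs):
    # Extra arguments are ignored because their number and meaning
    # depend on the RLlib version.
    return "dt_q_policy" if agent_id == DT_Q_AGENT else "ppo_policy"


def check_mapping(agent_ids):
    """Return {agent_id: policy_id} so the mapping can be inspected by eye."""
    return {agent_id: policy_mapping_fn(agent_id) for agent_id in agent_ids}


# Fragment of a multi-agent config. The "policies" entry is omitted on
# purpose: it needs the real policy classes and the env's spaces.
multiagent_config = {
    "policy_mapping_fn": policy_mapping_fn,
    "policies_to_train": ["ppo_policy", "dt_q_policy"],
}

if __name__ == "__main__":
    print(check_mapping(["agent_0", "agent_1", "agent_2"]))
```

If the mapping is wrong, every agent may silently end up on the same policy, so this is a cheap thing to rule out first.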

Have fun! :slight_smile: