I’m using MLflow with RLlib + Tune, and the MLflow plot that compares training curves across trials (e.g. mean reward) seems to be incorrect.
The step counts should all line up across trials, as they do in TensorBoard.
Also, it’s not clear what “step” refers to in MLflow, whereas in TensorBoard it clearly refers to timesteps.
The terminal logs look like this:
+-------------------+----------+---------------------+--------+------------------+--------+-----------+----------------------+----------------------+--------------------+
| Trial name | status | loc | iter | total time (s) | ts | reward | episode_reward_max | episode_reward_min | episode_len_mean |
|-------------------+----------+---------------------+--------+------------------+--------+-----------+----------------------+----------------------+--------------------|
| trial-2e3a6_00000 | RUNNING | 192.168.1.74:395169 | 5 | 1241.54 | 300000 | -0.671691 | -0.498597 | -0.844548 | 2880 |
| trial-2e3a6_00001 | RUNNING | 192.168.1.74:395170 | 5 | 1249.21 | 300000 | -0.678623 | -0.552662 | -0.798805 | 2880 |
| trial-2e3a6_00002 | RUNNING | 192.168.1.74:395144 | 5 | 1248.3 | 300000 | -0.674146 | -0.513186 | -0.821598 | 2880 |
| trial-2e3a6_00003 | RUNNING | 192.168.1.74:395171 | 6 | 1489.13 | 360000 | -0.637731 | -0.452467 | -0.802104 | 2880 |
+-------------------+----------+---------------------+--------+------------------+--------+-----------+----------------------+----------------------+--------------------+
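
For reference, here is a minimal sketch of what logging against timesteps could look like, so that “step” in MLflow means the same thing as the x-axis in TensorBoard. It assumes the `ts` column above corresponds to the `timesteps_total` key of the Tune result dict, and that `iter` and `reward` correspond to `training_iteration` and `episode_reward_mean`. The helper name `log_result_by_timesteps` is hypothetical and is not part of Ray or MLflow. The logging function is passed in, so `mlflow.log_metric` (which accepts a `step` argument) could be used in a real run.

```python
from typing import Any, Callable, Dict, Tuple


def log_result_by_timesteps(
    result: Dict[str, Any],
    log_metric: Callable[..., None],
    step_key: str = "timesteps_total",  # assumed to match the "ts" column
) -> Dict[str, Tuple[float, int]]:
    """Log every numeric entry of a result dict with result[step_key] as the step.

    Returns the logged metrics as {name: (value, step)} for inspection.
    """
    step = int(result[step_key])
    logged: Dict[str, Tuple[float, int]] = {}
    for name, value in result.items():
        # Skip the step itself, booleans (a subclass of int), and non-numbers.
        if name == step_key or isinstance(value, bool):
            continue
        if not isinstance(value, (int, float)):
            continue
        log_metric(name, float(value), step=step)
        logged[name] = (float(value), step)
    return logged


if __name__ == "__main__":
    # Stand-in logger so the sketch runs without an MLflow server.
    # In a real run, pass mlflow.log_metric instead.
    def print_metric(name: str, value: float, step: int = 0) -> None:
        print(f"step={step} {name}={value}")

    # Values taken from the first row of the table above.
    row = {
        "training_iteration": 5,
        "timesteps_total": 300000,
        "episode_reward_mean": -0.671691,
    }
    log_result_by_timesteps(row, print_metric)
```

With every trial logged this way, two trials at the same `ts` value would land on the same x-coordinate regardless of how many iterations each has completed.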