Add step info dictionary to MLflowLoggerCallback with Tune

How severe does this issue affect your experience of using Ray?

  • Medium: It contributes to significant difficulty to complete my task, but I can work around it.

I am using RLlib with MLflow, by using the MLflowLoggerCallback and passing it to

However, this makes the callback log only a few metrics that are generated by RLlib.

In my environment, I return several metrics of interest in the info dictionary returned at each environment step. How can I have them logged by MLflow?

Hi @fedetask,

You need to add a callback that adds your custom metrics.

Here is an explanation:

Here is an example:

So I should have both a CustomCallback and the MLflowLoggerCallback, right? The custom callback will add the info to custom_metrics, so they should be passed in this order: [CustomCallback(), MLflowLoggerCallback()]

Am I correct?


I do not think that is quite how it should work, but I could be wrong. In my understanding, the MLflowLoggerCallback is a Tune callback and the CustomCallback is an RLlib callback, so the CustomCallback belongs in the trainer config rather than in the Tune callback list. I think it should look something like this: {..., "callbacks": CustomCallback}
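
For illustration, here is a minimal sketch of what such an RLlib callback could look like. It is not taken from the linked example. It assumes a single-agent environment, the Ray 1.x import path for DefaultCallbacks, and that the values of interest in info are plain numbers; averaging them per episode is just one possible aggregation. The try/except around the import is only there so the sketch can be read and run without Ray installed.

```python
from collections import defaultdict

try:
    # Import path used by Ray 1.x; it may differ in other versions.
    from ray.rllib.agents.callbacks import DefaultCallbacks
except ImportError:
    # Fallback so this sketch can be exercised without Ray installed.
    DefaultCallbacks = object


class CustomCallback(DefaultCallbacks):
    """Copies numeric entries of the env's info dict into custom_metrics."""

    def on_episode_start(self, *, episode, **kwargs):
        # Per-episode scratch space: one list of values per info key.
        episode.user_data["info_values"] = defaultdict(list)

    def on_episode_step(self, *, episode, **kwargs):
        # last_info_for() returns the latest info dict (None before the first step).
        info = episode.last_info_for() or {}
        for key, value in info.items():
            if isinstance(value, (int, float)):
                episode.user_data["info_values"][key].append(value)

    def on_episode_end(self, *, episode, **kwargs):
        # Report the per-episode mean of every collected info entry.
        for key, values in episode.user_data["info_values"].items():
            if values:
                episode.custom_metrics[key] = sum(values) / len(values)
```

With this split, the class itself goes under the "callbacks" key of the trainer config, while the MLflowLoggerCallback instance stays in the callback list given to Tune.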

@xwjiang2010 can you please chime in here

@mannyv’s understanding of the callbacks is correct. @fedetask did you get a chance to try it?

Unfortunately, I am using Ray version 1.8.0; I cannot upgrade it for now.

I did as @mannyv described but things still don’t work. I think the reason is that in Ray 1.8.0, the MLflowLoggerCallback in, lines 148-160, logs stuff as follows:

148    def log_trial_result(self, iteration: int, trial: "Trial", result: Dict):
149        step = result.get(TIMESTEPS_TOTAL) or result[TRAINING_ITERATION]
150        run_id = self._trial_runs[trial]
151        for key, value in result.items():
152            try:
153                value = float(value)
154            except (ValueError, TypeError):
155                logger.debug("Cannot log key {} with value {} since the "
156                             "value cannot be converted to float.".format(
157                                 key, value))
158                continue
159            self.client.log_metric(
160                run_id=run_id, key=key, value=value, step=step)

and for key='custom_metrics', value is a dictionary that cannot be cast to float in line 153 and therefore isn’t logged.

I solved it by creating a new class that extends MLflowLoggerCallback and overriding the log_trial_result to allow for the custom_metrics dictionary to be logged.
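
A rough sketch of such a subclass is below. It is a reconstruction, not the exact code from this post: the class name and the flatten_metrics helper are made up for illustration, the import path is the one used by Ray 1.8.0, and the attribute names (self.client, self._trial_runs) are taken from the snippet above. Nested dictionaries such as custom_metrics are flattened into keys like custom_metrics/my_metric_mean before the float conversion.

```python
from typing import Any, Dict

try:
    # Import path as of Ray 1.8.0; it may differ in other versions.
    from ray.tune.integration.mlflow import MLflowLoggerCallback
except ImportError:
    # Fallback so this sketch can be exercised without Ray installed.
    MLflowLoggerCallback = object


def flatten_metrics(result: Dict[str, Any], sep: str = "/") -> Dict[str, Any]:
    """Hypothetical helper: {"a": {"b": 1}} becomes {"a/b": 1}."""
    flat = {}
    for key, value in result.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flatten_metrics(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


class FlatteningMLflowLoggerCallback(MLflowLoggerCallback):
    """Logs nested result entries (e.g. custom_metrics) as separate metrics."""

    def log_trial_result(self, iteration, trial, result):
        # Same step logic as the original method, with the constants spelled out.
        step = result.get("timesteps_total") or result["training_iteration"]
        run_id = self._trial_runs[trial]
        for key, value in flatten_metrics(result).items():
            try:
                value = float(value)
            except (ValueError, TypeError):
                continue  # skip strings, None, and other non-numeric values
            self.client.log_metric(
                run_id=run_id, key=key, value=value, step=step)
```

The subclass is then passed to Tune in place of MLflowLoggerCallback, with the same constructor arguments.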

