So I should have both a CustomCallback and the MLflowLoggerCallback, right? And the custom callback adds the info to custom_metrics, so they should be passed to tune.run() in this order:
I do not think that is quite how it should work, but I could be wrong. In my understanding, the MLflowLoggerCallback is a Tune callback and the CustomCallback is an RLlib callback. I think it should look something like this.
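For example, a minimal sketch of that wiring against the Ray 1.x API (the PPO trainer, the CartPole env, and the my_metric name are placeholders for illustration, not taken from the thread):

```python
from ray import tune
from ray.rllib.agents.callbacks import DefaultCallbacks
from ray.tune.integration.mlflow import MLflowLoggerCallback

class CustomCallback(DefaultCallbacks):
    """RLlib callback: runs inside the trainer and fills custom_metrics."""
    def on_episode_end(self, *, episode, **kwargs):
        episode.custom_metrics["my_metric"] = 1.0  # hypothetical metric

tune.run(
    "PPO",
    config={
        "env": "CartPole-v0",
        "callbacks": CustomCallback,  # RLlib callback goes in the trainer config
    },
    callbacks=[
        # Tune callback goes to tune.run() directly
        MLflowLoggerCallback(experiment_name="my_experiment"),
    ],
)
```

The key point is that the two callbacks live in different places: the RLlib callback class is passed inside the trainer `config`, while the Tune logger callback is passed to `tune.run()` itself.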
Hello,
Unfortunately, I am using Ray version 1.8.0; I cannot upgrade it for now.
I did as @mannyv described, but it still doesn't work. I think the reason is that in Ray 1.8.0, MLflowLoggerCallback (mlflow.py, lines 148-160) logs results as follows:
148 def log_trial_result(self, iteration: int, trial: "Trial", result: Dict):
149     step = result.get(TIMESTEPS_TOTAL) or result[TRAINING_ITERATION]
150     run_id = self._trial_runs[trial]
151     for key, value in result.items():
152         try:
153             value = float(value)
154         except (ValueError, TypeError):
155             logger.debug("Cannot log key {} with value {} since the "
156                          "value cannot be converted to float.".format(
157                              key, value))
158             continue
159         self.client.log_metric(
160             run_id=run_id, key=key, value=value, step=step)
For key='custom_metrics', value is a dictionary that cannot be cast to float on line 153, so it is silently skipped and never logged.
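To see why, float() raises TypeError on a dict, so the except branch fires and the continue skips the whole entry (the dict below just mimics the shape of result["custom_metrics"]):

```python
# MLflow's log_trial_result casts every value with float(); a nested
# dict such as custom_metrics raises TypeError and is skipped.
value = {"episode_score": 3.5}  # shape of result["custom_metrics"]
try:
    value = float(value)
    logged = True
except (ValueError, TypeError):
    logged = False  # the `continue` branch: nothing reaches MLflow
print("logged:", logged)  # logged: False
```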
I solved it by creating a new class that extends MLflowLoggerCallback and overrides log_trial_result() so that the custom_metrics dictionary is logged as well.