I am running Ray Tune in local mode (“ray.init(local_mode=True)”), but after running the first trial, I get the following error:
botocore.exceptions.NoCredentialsError: Unable to locate credentials
It seems that botocore is a library for Amazon Web Services. However, I don’t reference Amazon services anywhere, and I don’t even have an account there.
How can I tell Ray Tune not to use botocore at all, so that it does not try to connect to Amazon Web Services? Strangely, the code worked earlier, and the error only started to appear recently.
EDIT: This is probably related to the Ray Tune mlflow integration (@mlflow_mixin), or maybe it is just an mlflow issue. Either way, it is still unsolved.
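One thing that may be worth ruling out: MLflow pulls in boto3/botocore when an artifact location uses the s3:// scheme. Whether that is what happens here is not established in this thread, so the following is only a minimal sketch for checking it. The helper name `needs_aws_credentials` and the example URIs are made up for illustration.

```python
from urllib.parse import urlparse


def needs_aws_credentials(artifact_uri: str) -> bool:
    """Return True if the artifact URI uses the s3:// scheme."""
    return urlparse(artifact_uri).scheme.lower() == "s3"


# Illustrative URIs only; these are not taken from the original setup.
s3_result = needs_aws_credentials("s3://my-bucket/mlruns/0")
file_result = needs_aws_credentials("file:///tmp/mlruns/0")
relative_result = needs_aws_credentials("./mlruns/0")
print(s3_result, file_result, relative_result)

# Hypothetical usage inside an active MLflow run:
#   import mlflow
#   print(needs_aws_credentials(mlflow.get_artifact_uri()))
```

If the check returns True for the run's artifact URI, the tracking server's artifact location would be the place to look, not Ray Tune itself.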
Not on a laptop but on an on-premises computer, yes. It looks like that helped: when I removed ‘local_mode=True’, botocore was no longer invoked! Thank you very much!
Well, after a bit more testing I noticed that mlflow reporting stopped working. The code now runs without the ‘local_mode=True’ setting, but no reports reach the mlflow server. With ‘local_mode=True’, the mlflow reporting works, but the end of the first trial fails due to the botocore problem I described originally.
But yes, I removed MLflowLoggerCallback and that way got rid of the “botocore” problem. Maybe that callback is somehow accidentally hard-coded for AWS (Amazon) services; I don’t know.
However, what then happens is that after the second trial, the program just hangs and never continues. I can see the results of the first trial in the mlflow UI, but for the second trial only the parameters are reported, not the measures. This is probably because the trial hangs at exactly the point where it tries to report the measures.
The hang does not happen if I remove “local_mode=True” from the init call. I can live with that, but then there is yet another problem: the experiment name is somehow lost, and all mlflow logging goes into the “Default” category instead of appearing under the “experiment_name”. I set the experiment name before the tune.run() call with the mlflow.set_experiment(experiment_name) function. I need to study more to find out why the experiment name gets lost somewhere inside Tune. Any ideas?
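A possible explanation, offered as an assumption: mlflow.set_experiment() only sets the active experiment for the process that calls it, and without local mode the trials run in separate worker processes. The sketch below assumes the @mlflow_mixin integration reads its settings from an "mlflow" entry in the trial config (with "experiment_name" and "tracking_uri" keys), which is how the Ray Tune documentation describes that decorator. The helper name `with_mlflow_settings`, the search space, and the URI are placeholders.

```python
def with_mlflow_settings(search_space, experiment_name, tracking_uri):
    """Return a copy of the Tune config with an "mlflow" section added."""
    config = dict(search_space)  # shallow copy so the caller's dict is untouched
    config["mlflow"] = {
        "experiment_name": experiment_name,
        "tracking_uri": tracking_uri,
    }
    return config


# Placeholder values for illustration only.
search_space = {"lr": 0.01}
config = with_mlflow_settings(search_space, "my_experiment", "http://localhost:5000")
print(config)

# Hypothetical usage with a trainable decorated by @mlflow_mixin:
#   tune.run(train_fn, config=config)
```

Passing the name through the config means each worker receives it explicitly and does not depend on state set in the driver.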
My suggestion is to stay away from local_mode=True for now.
To debug the experiment_name issue, could you add some logging around setup_mlflow.py’s setup_mlflow method? Basically, I wonder whether self._mlflow.get_experiment(experiment_id=experiment_id) gives you the correct experiment that you already created in the driver code.
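A minimal sketch of that kind of logging, assuming a client object that exposes get_experiment_by_name() the way mlflow.tracking.MlflowClient does. The function name `describe_experiment` and the `StubClient` class are hypothetical; the stub is only there so the example runs without a tracking server.

```python
import logging
from types import SimpleNamespace

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("mlflow_debug")


def describe_experiment(client, experiment_name):
    """Look up an experiment by name and log what the client returns."""
    experiment = client.get_experiment_by_name(experiment_name)
    if experiment is None:
        logger.warning("No experiment named %r found", experiment_name)
        return None
    logger.info("Experiment %r has id %s", experiment.name, experiment.experiment_id)
    return experiment.experiment_id


class StubClient:
    """Stand-in for mlflow.tracking.MlflowClient (hypothetical, for the demo only)."""

    def __init__(self, experiments):
        self._by_name = {e.name: e for e in experiments}

    def get_experiment_by_name(self, name):
        return self._by_name.get(name)


stub = StubClient([SimpleNamespace(name="my_experiment", experiment_id="1")])
found_id = describe_experiment(stub, "my_experiment")
missing_id = describe_experiment(stub, "no_such_experiment")
```

In a real run, you would pass mlflow.tracking.MlflowClient() in place of the stub and call the function both in the driver and inside the trial. If the two calls report different ids, or the trial finds nothing, also compare mlflow.get_tracking_uri() in both places.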