Hi,
I am using Ray on a minikube instance. When I run TuneGridSearchCV, I get this error:
```
File "/tmp/ray/session_2024-01-11_08-00-20_576465_8/runtime_resources/pip/80e6243b51325ad06b7c8e2a691265dea3d56d75/virtualenv/lib/python3.8/site-packages/tune_sklearn/tune_basesearch.py", line 533, in _fit
self.analysis_ = self._tune_run(X, y, config, resources_per_trial,
File "/tmp/ray/session_2024-01-11_08-00-20_576465_8/runtime_resources/pip/80e6243b51325ad06b7c8e2a691265dea3d56d75/virtualenv/lib/python3.8/site-packages/tune_sklearn/tune_gridsearch.py", line 314, in _tune_run
analysis = tune.run(trainable, **run_args)
File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/tune/tune.py", line 1164, in run
return ExperimentAnalysis(
File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/tune/analysis/experiment_analysis.py", line 125, in __init__
raise ValueError(
ValueError: No experiment checkpoint file of form 'experiment_state-*.json' was found at: (local, /home/ray/ray_results/_Trainable_2024-01-11_08-53-06)
Please check if you specified the correct experiment path, which should be a combination of the `storage_path` and `name` specified in your run.
```
Then I tried adding the parameter `local_dir = '/home/reda/ray/ray_results'`, which is where the results are actually stored, but I get a new error:
```
Status message: Job entrypoint command failed with exit code 1, last available logs (truncated to 20,000 chars):
File "/tmp/ray/session_2024-01-11_08-00-20_576465_8/runtime_resources/pip/80e6243b51325ad06b7c8e2a691265dea3d56d75/virtualenv/lib/python3.8/site-packages/tune_sklearn/tune_basesearch.py", line 627, in fit
return self._fit(X, y, groups, tune_params, **fit_params)
File "/tmp/ray/session_2024-01-11_08-00-20_576465_8/runtime_resources/pip/80e6243b51325ad06b7c8e2a691265dea3d56d75/virtualenv/lib/python3.8/site-packages/tune_sklearn/tune_basesearch.py", line 536, in _fit
self.cv_results_ = self._format_results(self.n_splits, self.analysis_)
File "/tmp/ray/session_2024-01-11_08-00-20_576465_8/runtime_resources/pip/80e6243b51325ad06b7c8e2a691265dea3d56d75/virtualenv/lib/python3.8/site-packages/tune_sklearn/tune_basesearch.py", line 778, in _format_results
trial_dfs = out.fetch_trial_dataframes()
File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/tune/analysis/experiment_analysis.py", line 746, in fetch_trial_dataframes
raise DeprecationWarning(
DeprecationWarning: `fetch_trial_dataframes` is deprecated. Access the `trial_dataframes` property instead.
```
I don't understand why it can't find the right path. I also tried running Ray without the minikube instance and it works perfectly, so maybe this is related to minikube.
Here are the commands I used with minikube:
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm install kuberay-operator kuberay/kuberay-operator
helm install raycluster kuberay/ray-cluster
kubectl port-forward service/raycluster-kuberay-head-svc 8265:8265
ray job submit --runtime-env-json='{"working_dir": "./", "pip": ["ray[tune]", "numpy", "joblib", "scikit-learn", "tune-sklearn", "mlflow"]}' --address="http://127.0.0.1:8265" -- python script_tune.py
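
In case it helps narrow this down: the tracebacks show tune-sklearn being loaded from the job's runtime_env virtualenv while Ray itself comes from /home/ray/anaconda3, so the two may not be matching versions. Below is a minimal sketch for printing what the job actually sees. `installed_versions` is a hypothetical helper name of mine, not a Ray or tune-sklearn API; it only uses the standard library's `importlib.metadata`, and the package list is just the one from my `pip` runtime env.

```python
from importlib import metadata


def installed_versions(packages):
    """Hypothetical helper: map each package name to its installed version.

    Packages that are not installed in the current environment map to None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the runtime env of the job above.
    report = installed_versions(["ray", "tune-sklearn", "scikit-learn"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Running this at the top of script_tune.py, both with and without minikube, should show whether the two setups resolve different versions.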