Hello, I’m trying to run hyperparameter tuning via Tuner.fit(), and I’m encountering an error because all of my model logic lives in a custom Python module on my local machine that the workers can’t find. I’ve tried setting the local_dir parameter of the Tuner’s run_config, but that did not fix the issue. I also tried importing all necessary classes and functions inside my train function, but that isn’t working either.
Any references or suggestions?
Yep, here’s the error:
Traceback (most recent call last):
  File "python/ray/_raylet.pyx", line 1883, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 1984, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 1889, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 1830, in ray._raylet.execute_task.function_executor
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/ray/_private/function_manager.py", line 724, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/ray/_private/function_manager.py", line 636, in temporary_actor_method
    raise RuntimeError(
RuntimeError: The actor with name ImplicitFunc failed to import on the worker. This may be because needed library dependencies are not installed in the worker environment:
Traceback (most recent call last):
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/ray/_private/function_manager.py", line 675, in _load_actor_class_from_gcs
    actor_class = pickle.loads(pickled_class)
ModuleNotFoundError: No module named 'randomwalk_analysis'
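For context, the failing call in the traceback is pickle.loads. A function pickled from an importable module is stored by reference (module name plus function name), so whoever unpickles it has to be able to import that module. The sketch below reproduces the same ModuleNotFoundError with the standard library only, with no Ray involved. The module name demo_local_module and its train function are hypothetical stand-ins, not part of randomwalk_analysis.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a local-only package such as randomwalk_analysis.
MODULE_NAME = "demo_local_module"


def pickle_then_unpickle_without_module():
    """Pickle a function while its module is importable, then unpickle after
    the module is gone, mimicking a worker that lacks the package."""
    with tempfile.TemporaryDirectory() as tmp:
        # Create a tiny module on disk and make it importable ("driver" side).
        Path(tmp, MODULE_NAME + ".py").write_text(
            "def train(config):\n    return config\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(MODULE_NAME)
            # The payload holds only the names, not the function's code.
            payload = pickle.dumps(module.train)
        finally:
            # Forget the module again ("worker" side without the package).
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)

    importlib.invalidate_caches()
    try:
        pickle.loads(payload)
    except ModuleNotFoundError as exc:
        return payload, str(exc)
    return payload, None


if __name__ == "__main__":
    payload, error = pickle_then_unpickle_without_module()
    print(error)
```

This would also explain why moving the imports inside the train function did not help: the train function itself is still defined in randomwalk_analysis, so the worker has to import that package before it can load the function at all.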
And if it helps, here’s a snippet of my Ray Tune training setup. I’m effectively running this code in a main.py and importing my train function from a submodule within randomwalk_analysis. The train function uses a variety of datasets, dataloaders, and torch models also defined in randomwalk_analysis.model.
from ray import tune
from ray.tune.schedulers import ASHAScheduler
# Assumption: RunConfig is imported from ray.air here; newer Ray versions expose it from ray.train
from ray.air import RunConfig

from randomwalk_analysis.model import train_model_raytune

# Bind the extra (non-tuned) arguments to the train function
trainable = tune.with_parameters(train_model_raytune, *other parameters*)

# Request CPUs and GPUs for each trial
trainable_with_cpu_gpu = tune.with_resources(
    trainable,
    {"cpu": args.num_workers, "gpu": args.num_gpus}
)

tuner = tune.Tuner(
    trainable_with_cpu_gpu,
    param_space=param_space,
    tune_config=tune.TuneConfig(
        num_samples=args.num_samples,
        scheduler=ASHAScheduler(
            metric="val_loss",
            mode="min",
            max_t=args.max_epochs,
            grace_period=args.grace_period,
            reduction_factor=args.reduction_factor
        )
    ),
    run_config=RunConfig(local_dir=*absolute path to local directory containing randomwalk_analysis*)
)
result = tuner.fit()
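One thing that may be worth trying: as far as I can tell, local_dir only controls where Tune writes its results, so it does not make any code available to the workers. Ray's runtime_env option has a py_modules field meant for shipping local packages to them. Below is a minimal sketch under a few assumptions: the installed Ray version supports runtime_env, ray.init() is called in main.py before the Tuner is built, and the path points at the randomwalk_analysis package directory itself (the one containing __init__.py). The helper build_runtime_env and the placeholder path are hypothetical, not part of Ray.

```python
from pathlib import Path


def build_runtime_env(package_dir):
    """Hypothetical helper: build a runtime_env dict that asks Ray to upload
    one local package directory so workers can import it."""
    package_path = Path(package_dir).expanduser().resolve()
    return {"py_modules": [str(package_path)]}


def main():
    # Imported here so the helper above can be used without Ray installed.
    import ray

    # Placeholder path: replace with the real location of the package directory.
    runtime_env = build_runtime_env("path/to/randomwalk_analysis")
    ray.init(runtime_env=runtime_env)

    # ... build the Tuner and call tuner.fit() exactly as in the snippet above ...


if __name__ == "__main__":
    main()
```

Setting "working_dir" in the same dict to the project root is a related option that uploads the whole directory rather than a single package; which one fits better likely depends on how large the project directory is.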