I’m currently relying on tune.Tuner
to run my experiment on a machine that has 28 CPUs and 2 GPUs. Since I'm not the only one with access to this machine, I'd like to restrict my experiment to a single GPU.
Despite specifying with_resources(Trainer, {"cpu": 1, "gpu": 1}), both GPUs are used. The only way I have found to avoid this is by setting os.environ["CUDA_VISIBLE_DEVICES"] = "1" myself.
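For context, this is the manual workaround I mean; my understanding is that the variable has to be set before any CUDA context is created (i.e. before Ray or the model code touches the GPU):

```python
import os

# Manual workaround: expose only GPU 1 to this process and to the Ray
# workers it spawns. Must run before any CUDA context is initialized.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
```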
Is there a way to achieve my goal without explicitly setting an environment variable myself? If I understand correctly, according to the documentation this should be taken care of by tune.with_resources:
"To leverage GPUs, you must set gpu in tune.with_resources(trainable, resources_per_trial). This will automatically set CUDA_VISIBLE_DEVICES for each trial."
from pathlib import Path

from ray.air import RunConfig
from ray.tune import Tuner, TuneConfig, with_resources

# epochs, ck_config, model_name, exp_details, Trainer, and configuration
# are defined elsewhere in my script.
run_config = RunConfig(
    stop={"training_iteration": epochs},
    checkpoint_config=ck_config,
    name=f"{model_name}_{exp_details}",
    local_dir=str(Path(__file__).parent / "ray_checkpoints"),
)

tuner = Tuner(
    trainable=with_resources(Trainer, {"cpu": 1, "gpu": 1}),
    run_config=run_config,
    tune_config=TuneConfig(mode="min", metric="val_loss", num_samples=5),
    param_space=configuration,
)