How to avoid packaging when performing hyperparameter tuning on local machine?

1. Severity of the issue: (select one)
High: Completely blocks me.

2. Environment:

  • Ray version: 2.49.2
  • Python version: 3.12.2
  • OS: Mac
  • Cloud/Infrastructure: none
  • Other libs/tools (if relevant): none

3. What happened vs. what you expected:

  • Expected: Skip packaging, or provide an interface to exclude certain paths, when running Tuner on a local machine
  • Actual: I couldn’t find a way to configure that

My use case:

    tuner = tune.Tuner(
        trainable,
        tune_config=tune.TuneConfig(
            num_samples=self.num_samples,
            max_concurrent_trials=self.max_concurrent_trials,
        ),
        run_config=tune.RunConfig(
            failure_config=tune.FailureConfig(fail_fast=True),
            stop=self.stopper,
            callbacks=callbacks,
            checkpoint_config=tune.CheckpointConfig(
                checkpoint_frequency=0,
            ),
        ),
        param_space=pspace,
    )
    tuner.fit()

I run it with RAY_CHDIR_TO_TRIAL_DIR=0 PYTHONPATH=. uv run python workspace/tune.py. Ray attempts to package the whole workspace directory but fails, because there is a dataset directory workspace/data that is very large.

I couldn’t find a way to disable the packaging behaviour, nor to exclude certain paths from packaging. I tried calling ray.init(runtime_env={"excludes": ["data/"]}), but the program reported a duplicate initialization, which I guess is caused by the Tuner.

So is there any workaround to this problem? Thanks.

Ray Tune’s Tuner automatically packages the current working directory (including large folders like workspace/data) when running locally, and there is currently no direct interface in Tuner or RunConfig to exclude specific paths from this packaging process. The excludes option in ray.init(runtime_env={...}) is not supported for Tuner’s internal packaging, and duplicate initialization errors occur if you try to call ray.init() after Tuner has already initialized Ray. This limitation is a known issue and there is no built-in way to skip or customize the packaging/exclusion behavior for local runs as of Ray 2.49.2.
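For reference, the shape of the configuration being discussed looks like the following. This is a sketch of Ray's documented runtime_env options ("working_dir" and "excludes", which takes gitignore-style patterns); the specific paths and patterns are illustrative. As noted above, even when passed to ray.init() before the Tuner initializes Ray (which avoids the duplicate-initialization error), it is not guaranteed to affect Tuner's own packaging:

```python
# Sketch of a runtime_env with exclusions (patterns are illustrative).
# "excludes" takes gitignore-style patterns relative to working_dir.
runtime_env = {
    "working_dir": ".",
    "excludes": ["data/", "*.parquet"],  # skip the huge dataset directory
}
# ray.init(runtime_env=runtime_env) must run *before* the Tuner touches Ray,
# otherwise the duplicate-initialization error described above occurs.
```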

As a workaround, you can try running your script from a directory that does not contain the large data folder, or temporarily move/rename the data directory before running Tuner. For more details and discussion, see the related issue in the Ray repository: Ray GitHub Issue #51432.

Would you like more detail or suggestions for alternative workflows?

Hint: Mention @RunLLM in the post for followups.