yj373
October 16, 2024, 4:47am
1
Hello,
I am using Ray to tune hyperparameters. The tuning progress is reported every 30 seconds by default, and I want to change the update interval to 60 seconds or longer. How can I achieve that?
I have tried all the examples given in Tune Console Output (Reporters) — Ray 2.43.0
However, none of them work. Has anyone met the same problem?
Here is my code:
import os

from ray import tune
from ray.train import RunConfig
from ray.tune import CLIReporter
from ray.tune.experiment import Trial
from ray.tune.search.optuna import OptunaSearch


class ExperimentTerminationReporter(CLIReporter):
    def should_report(self, trials, done=False):
        """Reports only on experiment termination."""
        return done


class TrialTerminationReporter(CLIReporter):
    def __init__(self):
        super(TrialTerminationReporter, self).__init__()
        self.num_terminated = 0

    def should_report(self, trials, done=False):
        """Reports only on trial termination events."""
        old_num_terminated = self.num_terminated
        self.num_terminated = len([t for t in trials if t.status == Trial.TERMINATED])
        return self.num_terminated > old_num_terminated


search_space = {
    't': tune.uniform(50, 150),
    'map_weight1': tune.uniform(0, 1),
    'map_weight2': tune.uniform(0, 1),
    'map_weight3': tune.uniform(0, 1),
    'map_weight4': tune.uniform(0, 1),
    'alpha': tune.uniform(1, 10),
    'beta': tune.uniform(0, 1),
}

os.environ["RAY_CHDIR_TO_TRIAL_DIR"] = "0"

tuner = tune.Tuner(
    tune.with_resources(
        tune.with_parameters(stable_diffusion_func, args=args),
        resources={"cpu": args.num_cpu, "gpu": args.num_gpu},
    ),
    param_space=search_space,
    tune_config=tune.TuneConfig(
        search_alg=OptunaSearch(), metric="f1_auc", mode="max", num_samples=args.num_runs
    ),
    run_config=RunConfig(progress_reporter=ExperimentTerminationReporter()),
    # run_config=RunConfig(progress_reporter=reporter),
)
results = tuner.fit()
Per ray.tune.CLIReporter — Ray 2.37.0, the param max_report_frequency isn't working for you, @yj373?
yj373
October 18, 2024, 6:02am
3
Sam_Chan:
max_report_frequency
No, it doesn't. The tuning progress is still reported every 30 seconds. Here is the code:

reporter = CLIReporter()
reporter.max_report_frequency = 60

tuner = tune.Tuner(
    tune.with_resources(
        tune.with_parameters(stable_cascade_func, args=args),
        resources={"cpu": args.num_cpu, "gpu": args.num_gpu},
    ),
    param_space=search_space,
    tune_config=tune.TuneConfig(
        search_alg=OptunaSearch(), metric="f1_auc", mode="max", num_samples=args.num_runs
    ),
    run_config=RunConfig(progress_reporter=reporter),
)
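One possible explanation for the attribute assignment having no effect (an assumption about Ray internals, not confirmed in this thread): CLIReporter copies max_report_frequency into a private field in its constructor, so a value assigned to the public attribute afterwards is never read. The pitfall can be sketched with plain Python — the Reporter class below is a stand-in, not Ray's:

```python
# Illustrative stand-in (not Ray itself): a class that copies the
# constructor argument into a private field, as many reporters do.
class Reporter:
    def __init__(self, max_report_frequency=5):
        # value is captured once, into a privately named field
        self._max_report_frequency = max_report_frequency

    def frequency(self):
        return self._max_report_frequency

r = Reporter()
r.max_report_frequency = 60   # creates a NEW attribute; private field unchanged
assert r.frequency() == 5     # still the default

r2 = Reporter(max_report_frequency=60)  # pass at construction instead
assert r2.frequency() == 60
```

If that is the cause, `CLIReporter(max_report_frequency=60)` would be the safer form — though, per the issues linked further down the thread, the reporter may be ignored entirely unless RAY_AIR_NEW_OUTPUT=0 is set.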
Daraan
December 6, 2024, 12:18pm
4
I have the same situation on 2.38. I've both reduced and increased max_report_frequency, but the report is still made every 30 seconds.
Daraan
February 17, 2025, 4:41pm
5
I think this is related to a GitHub issue opened 04 Dec 24 UTC, 06:39 PM (labels: tune, triage, docs):

### Description
The documentation of ray.tune states in several pages (for example [here](https://docs.ray.io/en/latest/tune/tutorials/tune-output.html#how-to-control-console-output-with-tune)) that the console output is configurable using for example CLIReporter. However, I found out that this is no longer the case unless the env variable RAY_AIR_NEW_OUTPUT is set to 0.
I believe this behavior is briefly discussed in issue #36949
### Link
[How to control console output with tune](https://docs.ray.io/en/latest/tune/tutorials/tune-output.html#how-to-control-console-output-with-tune)
and
a GitHub issue opened 29 Jun 23 UTC, 08:56 AM (label: stale):

# Experimental features in Ray AIR
The Ray Team is testing a number of experimental features in Ray AIR.
During development, the features are disabled per default. You can opt-in by setting a feature-specific environment variable.
After some time, the Ray Team enables the feature by default to gather more feedback from the community. In that case, you can still disable the feature using the same environment variable to fully revert to the old behavior.
If you run into issues with experimental features, [open an issue](https://github.com/ray-project/ray/issues/) on GitHub. The Ray Team considers feedback before removing the old implementation and making the new implementation the default.
> ### Note
> Experimental features can undergo frequent changes, especially on the master branch and the nightly wheels.
## Context-aware progress reporting
> ### Note
> This feature is enabled by default in Ray 2.6.
> To disable, set the environment variable `RAY_AIR_NEW_OUTPUT=0`.
A context-aware output engine is available for Ray Train and Ray Tune runs.
This output engine affects how the training progress is printed in the console. The output changes depending on the execution context: Ray Tune runs will be displayed differently to Ray Train runs.
The features include:
- Ray Train runs report status relevant to the single training run. It does not use the default Ray Tune table layout from previous versions.
- The table format has been updated.
- The format of reporting configurations and observed metrics is different from previous versions.
- Significant reduction in the default metrics displayed in the console output for runs (e.g., RLlib runs).
- Decluttered the output to improve readability.
This output feature only works for the regular console. It is automatically disabled when you use Jupyter Notebooks or Ray client.
## Rich layout (sticky status)
> Note
> This feature is disabled by default.
> To enable, set the environment variable `RAY_AIR_RICH_LAYOUT=1`.
The [context-aware output engine](https://docs.ray.io/en/master/ray-air/experimental-features.html#air-experimental-new-output) exposes an advanced layout using the [rich](https://github.com/Textualize/rich) library.
The rich layout provides a sticky status table: The regular console logs are still printed as before, but the trial overview table (in Ray Tune) is stuck to the bottom of the screen and periodically updated.
This feature is still in development. You can opt-in to try it out.
To opt-in, set the `RAY_AIR_RICH_LAYOUT=1` environment variable and install rich (pip install rich).
## Event-based trial execution engine
> Note
> This feature is enabled by default starting Ray 2.5.
> To disable, set the environment variable `TUNE_NEW_EXECUTION=0`.
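As an aside, a per-invocation opt-out avoids changing the shell environment globally (assumption: a POSIX shell with `python3` on PATH; in a real run the command would be your own Tune script, e.g. `python3 my_tune_script.py` — a placeholder name):

```shell
# Sketch: set the variable for one invocation only. Here we just echo
# the variable back from the child process to show it is visible there;
# replace the -c command with your actual Tune script.
TUNE_NEW_EXECUTION=0 python3 -c 'import os; print(os.environ["TUNE_NEW_EXECUTION"])'
```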
Ray Tune has an updated trial execution engine. Since Ray Tune is also the execution backend for Ray Train, the updated engine affects both tuning and training runs.
The update is a refactor of the [TrialRunner](https://docs.ray.io/en/master/tune/api/internals.html#trialrunner-docstring) which uses a generic Ray actor and future manager instead of the previous RayTrialExecutor. This manager exposes an interface to react to scheduling and task execution events, which makes it easier to maintain and develop.
This is a drop-in replacement of an internal class, and you shouldn’t see any change to the previous behavior.
However, if you notice any odd behavior, you can opt out of the event-based execution engine and see if it resolves your problem.
In that case, please [open an issue](https://github.com/ray-project/ray/issues/) on GitHub, ideally with a reproducible script.
Things to look out for:
- Fewer trials are running in parallel than before
- It takes longer to start new trials (or goes much faster)
- The tuning run finishes, but the script does not exit
- The end-to-end runtime is much slower than before
- The CPU load on the head node is high, even though the training jobs don’t require many resources or don’t run on the head node
- Any exceptions are raised that indicate an error in starting or stopping trials or the experiment
Note that some edge cases may not be captured in the regression tests. Your feedback is welcome.
Crawling through the code, I think the CLIReporter is not used anymore, in favor of the new reporters that came with the new RAY_AIR_NEW_OUTPUT.
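Putting the issues above together, here is a hedged sketch of the workaround they imply (assuming their diagnosis applies here): disable the new output engine before importing ray, then configure the legacy reporter.

```python
import os

# Per the issues quoted above: the legacy CLIReporter is only honored
# when the new output engine is off. This must run before any
# `import ray` / `from ray import tune` statement.
os.environ["RAY_AIR_NEW_OUTPUT"] = "0"

# Remaining setup (sketch, unchanged from the original post except that
# the frequency is passed to the constructor rather than assigned):
# from ray import tune
# from ray.tune import CLIReporter
# reporter = CLIReporter(max_report_frequency=60)  # seconds
# tuner = tune.Tuner(..., run_config=RunConfig(progress_reporter=reporter))
```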