Trouble with some results from Ray Tune

Jax · July 22, 2024, 3:29pm

Hi, I am using Ray Tune for a project I’m working on and have mostly had great results, but I have a few open questions that I couldn’t find answered in the documentation. My questions are below; any help is greatly appreciated!

1. The training_iteration output shows the same value (1) for almost every trial. Does this mean that almost all of my initial trials are running at the same time? I have my max concurrency value set very low, so I’m surprised to see more trials than that running at once. Is there a way to limit how many trials run concurrently?
2. What setting do I need to enable to save model checkpoints? I’d like to save the best-performing model at the end of a Tune run and potentially use it for predictions. I set up the CheckpointConfig option, but none of my runs produce any model checkpoints.
3. Is there a way to control the early stopping of trials? I’d like to experiment with letting trials run longer to see how that affects results.

Jax · August 7, 2024, 2:28pm

Bump. Any advice on this?
