Concept of `training_iteration`

Mayank_Bhardwaj · February 13, 2023, 5:56pm

I am using ray.tune.run() to tune hyperparameters. I have set the stop criteria to {"mean_accuracy": 0.75, "training_iteration":100}. When I run it, the Trial Status table always shows ‘1’ under iter, and the accuracy is either ‘0’ or ‘0.16667’. I think I am misunderstanding what an ‘iteration’ means and would like some clarity on that.

Edit - Adding code and console log output for reference

Note -

I have taken these 6 books for debugging purposes. This is not the dataset I’ll be using.
This is the link to the paper I am using.
NewIndex calculates the ‘Clustering Evaluation Index’ mentioned in the paper.
Ks is the number of clusters over which the index, I, will be evaluated.

I hope the rest of the variables are self-explanatory.

Please find Figs. 2 and 3 in the comments. For some reason I am not allowed to add more than one image in the body.
Sorry for the inconvenience.

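import ray
from ray import tune
from ray.tune.schedulers import ASHAScheduler
from ray.tune.search.optuna import OptunaSearch

# FeatureExtraction, DimensionReduction and NewIndex are defined elsewhere (not shown).
def train_new_index(config):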
  books = ["shakespeare","shakespeare_jane_austen","shakespeare_jane_austen_holmes","jane_austen","holmes","hp"]
  authors = [1,2,3,1,1,1]

  alpha1 = config['alpha1']
  alpha2 = config['alpha2']
  delta = config['delta']

  correct = 0

  for i in range(len(books)):
    text = open(books[i] + ".txt").read()
    chunkSize = 22
    step = 22
    
    features_list = FeatureExtraction(text, chunkSize, step)
    reduced_dimension_features_list = DimensionReduction(features_list)
    
    # Ks=6 is passed positionally: a keyword argument cannot precede positional ones.
    I, cluster = NewIndex(6, reduced_dimension_features_list, alpha1, alpha2, delta)
    if (cluster == authors[i]): 
      correct += 1
  accuracy = correct/len(authors)
  tune.report(mean_accuracy=accuracy)

search_space = {
    "alpha1": tune.uniform(0.0, 1.0),
    "alpha2": tune.uniform(0.0, 1.0),
    "delta": tune.uniform(0.0, 1.0),
}

ray.init(ignore_reinit_error=True)

optuna_search = OptunaSearch(
    space=search_space,
    metric="mean_accuracy",
    mode="max"
)

asha_scheduler = ASHAScheduler(
    max_t=100,
    metric='mean_accuracy',
    mode='max'
)

# tune.run() executes the experiment and returns an ExperimentAnalysis,
# so no separate .fit() call is needed (that method belongs to tune.Tuner).
analysis = tune.run(
    run_or_experiment=train_new_index,
    name="new_clustering_evaluation_index_tune",
    search_alg=optuna_search,
    num_samples=50,
    scheduler=asha_scheduler,
    stop={"mean_accuracy": 0.75, "training_iteration":100},
    chdir_to_trial_dir=False
)

Console Log Output

2023-02-13 19:00:48,873	INFO worker.py:1370 -- Calling ray.init() again after it has already been called.
2023-02-13 19:00:48,878	WARNING optuna_search.py:330 -- You passed a `space` parameter to OptunaSearch that contained unresolved search space definitions. OptunaSearch should however be instantiated with fully configured search spaces only. To use Ray Tune's automatic search space conversion, pass the space definition as part of the `param_space` argument to `tune.Tuner()` instead.
[I 2023-02-13 19:00:48,884] A new study created in memory with name: optuna

<Tune Status Table; refer to the Fig. - 1>

(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:12: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:22: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:23: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:12: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:22: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:23: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:12: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:22: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:23: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:12: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:22: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:23: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:12: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:22: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:23: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:12: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
.
.
.
.
.
A lot of repeated lines

<Trial Progress Table; Refer to Fig.2 - 4>

(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:12: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:22: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41585) <ipython-input-41-fe840bf012af>:23: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41641) <ipython-input-41-fe840bf012af>:12: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
(train_new_index pid=41641) <ipython-input-41-fe840bf012af>:22: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
.
.
.
.
This continues.......
.
.
.
.

Thanks for your help and patience!

xwjiang2010 · February 13, 2023, 6:26pm

Could you share your script and console log output?
Thanks!

Mayank_Bhardwaj · February 13, 2023, 7:30pm

@xwjiang2010 I have updated the body. The rest of the images are attached below in the comments:

Thanks!

Mayank_Bhardwaj · February 13, 2023, 7:31pm

Mayank_Bhardwaj · February 13, 2023, 7:31pm

xwjiang2010 · February 13, 2023, 10:45pm

In your case, session.report() is called only once per trial, with a single result. That is why the iteration count shows 1. Also, most trials have an accuracy of 0.666667. This probably means that the hyperparameters have little impact on performance.

Although you set training_iteration to stop at 100, each trial reports only one result, so it finishes after iteration 1.
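
To make the counting concrete, here is a minimal toy model in plain Python. It is not Ray Tune's implementation: `ToySession`, `report_once` and `report_every_step` are hypothetical names, and the accuracy values are made up. It only illustrates the idea described above, that the iteration counter advances once per reported result and that a trial stops as soon as any stop criterion is reached.

```python
# Toy model only: NOT Ray Tune's code. All names here are hypothetical.

class ToySession:
    """Counts how many results a trial has reported."""

    def __init__(self, stop):
        self.stop = stop              # e.g. {"mean_accuracy": 0.75, "training_iteration": 100}
        self.training_iteration = 0   # incremented once per report() call
        self.last_result = None

    def report(self, **metrics):
        """Record one result; return True if any stop criterion is met."""
        self.training_iteration += 1
        self.last_result = dict(metrics, training_iteration=self.training_iteration)
        return any(
            self.last_result.get(key, float("-inf")) >= threshold
            for key, threshold in self.stop.items()
        )


def report_once(session):
    # Same shape as the function in the question: all the work, then one report.
    session.report(mean_accuracy=1 / 6)


def report_every_step(session, steps=5):
    # Reports after each step, so the counter advances with every call.
    for step in range(1, steps + 1):
        if session.report(mean_accuracy=step / 10):  # made-up accuracy values
            break


stop = {"mean_accuracy": 0.75, "training_iteration": 100}

single = ToySession(stop)
report_once(single)
print(single.training_iteration)  # 1

looped = ToySession(stop)
report_every_step(looped)
print(looped.training_iteration)  # 5
```

In this sketch, a function that reports a single result can never get past iteration 1, whatever the training_iteration threshold is. To see higher iteration counts, the trainable would need to report from inside a loop.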

Mayank_Bhardwaj · February 14, 2023, 3:58am

I understood the ‘iteration’ part. However,

Can you give a little more detail on this? In both Trial Status (Fig. 1) and Trial Progress (Fig. 3), the acc and mean_accuracy columns show 0.16667.
Or was this a typo?

Thanks.

xwjiang2010 · February 14, 2023, 3:50pm

I think “acc” (the one that shows up in the progress reporter) and “mean_accuracy” (as you reported) are the same thing. So they both have the same value 0.16667 meaning 1 out of 6 categories gets the right clustering.
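
As a quick check of the arithmetic, the sketch below reuses the `authors` list from the question. The predicted clusters are made-up values chosen so that exactly one of the six matches; they are not taken from the actual run.

```python
# Ground-truth labels from the question.
authors = [1, 2, 3, 1, 1, 1]
# Hypothetical predictions for illustration; only the first one matches.
hypothetical_clusters = [1, 1, 1, 2, 2, 2]

correct = sum(p == a for p, a in zip(hypothetical_clusters, authors))
mean_accuracy = correct / len(authors)
print(correct, round(mean_accuracy, 5))  # 1 0.16667
```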

Mayank_Bhardwaj · February 14, 2023, 5:12pm

No, I meant the value you quoted (0.666667).
But now I get it, it was a typo.

Thanks for your help!
