Could not find best trial

Hi. I am new to Ray Tune and cannot figure out my mistake. I am trying to optimise the hyperparameters of my neural network. After running tune.run, I call get_best_trial, which should give me the trial with the best set of hyperparameters, but it returns None and I get an error as soon as I use the result. I call tune.report(mean_error=error_test) in my training function train and then run:

from ray import tune
from ray.tune.schedulers import ASHAScheduler

analysis = tune.run(
    train,
    num_samples=2,
    scheduler=ASHAScheduler(metric="mean_error", mode="min"),
    config=config)

best_trial = analysis.get_best_trial(metric="mean_error", mode="min")

This returns best_trial as None. Can someone help me understand why this is? The error suggests I could have passed the wrong metric but I can’t see how.

Edit
I have now tried to perform the same task with different metrics such as my loss but I still get the same result.

@Paul_V I would suggest turning up the verbosity to see whether anything interesting shows up:

tune.run(..., verbose=3)

Also, can you try running your tuning job with a simple function for train, like:

from ray import tune

def train(config):
    import time
    time.sleep(1)  # stand-in for real training work
    tune.report(mean_error=1)  # report a plain Python number

Hi @rliaw. The verbose output only confirms that I should be getting a result: I can clearly see that the networks report different values for mean_error. Your example also works perfectly. I still don’t understand what is causing the problem.

Hi @Paul_V, thanks for the update. Could you post the entire stdout of your tuning run (with verbose=3)? That’d help me narrow down the problem!

2020-12-20 23:44:39,661	WARNING function_runner.py:540 -- Function checkpointing is disabled. This may result in unexpected behavior when using checkpointing features or certain schedulers. To enable, set the train function arguments to be `func(config, checkpoint_dir=None)`.
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 64.000: None | Iter 16.000: None | Iter 4.000: None | Iter 1.000: None
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 1/2 (1 RUNNING)
+-------------------+----------+-------+------+----------------+------------+-------------+----------------+------------+-------------+
| Trial name        | status   | loc   |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |
|-------------------+----------+-------+------+----------------+------------+-------------+----------------+------------+-------------|
| train_f1a0a_00000 | RUNNING  |       |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |
+-------------------+----------+-------+------+----------------+------------+-------------+----------------+------------+-------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-44-43
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 1
  mean_error: tensor(1.4092)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 2.0174179077148438
  time_this_iter_s: 2.0174179077148438
  time_total_s: 2.0174179077148438
  timestamp: 1608504283
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: f1a0a_00001
  
(pid=70662) 0 tensor([11.6238]) tensor([-10.5612])
(pid=70662) error test tensor(1.4092)
Result for train_f1a0a_00000:
  date: 2020-12-20_23-44-44
  done: true
  experiment_id: f2854fc2f62f46469d9aede22ae970e5
  experiment_tag: 0_n2=10,u_hidden_dim=20,u_layers=5,u_rate=0.00029646,v_hidden_dim=10,v_layers=2,v_rate=1.4213e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 1
  mean_error: tensor(1.4153)
  node_ip: 10.0.232.46
  pid: 70663
  time_since_restore: 3.6579511165618896
  time_this_iter_s: 3.6579511165618896
  time_total_s: 3.6579511165618896
  timestamp: 1608504284
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: f1a0a_00000
  
(pid=70663) 0 tensor([11.8109]) tensor([-10.7405])
(pid=70663) error test tensor(1.4153)
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: None | Iter 4.000: None | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |      2 |          3.94857 |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-44-48
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 4
  mean_error: tensor(1.4090)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 7.22578501701355
  time_this_iter_s: 1.5830650329589844
  time_total_s: 7.22578501701355
  timestamp: 1608504288
  timesteps_since_restore: 0
  training_iteration: 4
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.3/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: None | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |      6 |         10.2233  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-44-54
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 8
  mean_error: tensor(1.4088)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 13.50598692893982
  time_this_iter_s: 1.5855488777160645
  time_total_s: 13.50598692893982
  timestamp: 1608504294
  timesteps_since_restore: 0
  training_iteration: 8
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.3/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: None | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     10 |         16.7524  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


(pid=70662) 10 tensor([11.6229]) tensor([-10.5610])
(pid=70662) error test tensor(1.4087)
Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-01
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 12
  mean_error: tensor(1.4086)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 20.253225803375244
  time_this_iter_s: 1.8119208812713623
  time_total_s: 20.253225803375244
  timestamp: 1608504301
  timesteps_since_restore: 0
  training_iteration: 12
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: None | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     13 |         21.9243  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-07
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 16
  mean_error: tensor(1.4085)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 26.88408899307251
  time_this_iter_s: 1.6711490154266357
  time_total_s: 26.88408899307251
  timestamp: 1608504307
  timesteps_since_restore: 0
  training_iteration: 16
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     17 |         28.5578  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-14
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 20
  mean_error: tensor(1.4083)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 33.45438098907471
  time_this_iter_s: 1.6251630783081055
  time_total_s: 33.45438098907471
  timestamp: 1608504314
  timesteps_since_restore: 0
  training_iteration: 20
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     21 |         35.1464  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


(pid=70662) 20 tensor([11.6219]) tensor([-10.5607])
(pid=70662) error test tensor(1.4082)
Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-19
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 23
  mean_error: tensor(1.4081)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 38.50504493713379
  time_this_iter_s: 1.5996689796447754
  time_total_s: 38.50504493713379
  timestamp: 1608504319
  timesteps_since_restore: 0
  training_iteration: 23
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     24 |         40.1933  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-26
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 27
  mean_error: tensor(1.4079)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 45.02883505821228
  time_this_iter_s: 1.5702321529388428
  time_total_s: 45.02883505821228
  timestamp: 1608504326
  timesteps_since_restore: 0
  training_iteration: 27
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     28 |         46.6981  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-32
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 31
  mean_error: tensor(1.4077)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 51.30670785903931
  time_this_iter_s: 1.354578971862793
  time_total_s: 51.30670785903931
  timestamp: 1608504332
  timesteps_since_restore: 0
  training_iteration: 31
  trial_id: f1a0a_00001
  
(pid=70662) 30 tensor([11.6209]) tensor([-10.5604])
(pid=70662) error test tensor(1.4077)
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     32 |         52.9042  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-38
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 35
  mean_error: tensor(1.4075)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 57.73534893989563
  time_this_iter_s: 1.5731749534606934
  time_total_s: 57.73534893989563
  timestamp: 1608504338
  timesteps_since_restore: 0
  training_iteration: 35
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     36 |         59.4124  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-45
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 39
  mean_error: tensor(1.4074)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 64.22594499588013
  time_this_iter_s: 1.5776619911193848
  time_total_s: 64.22594499588013
  timestamp: 1608504345
  timesteps_since_restore: 0
  training_iteration: 39
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     40 |         65.895   |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


(pid=70662) 40 tensor([11.6198]) tensor([-10.5601])
(pid=70662) error test tensor(1.4073)
Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-51
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 43
  mean_error: tensor(1.4072)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 70.80098509788513
  time_this_iter_s: 1.6036851406097412
  time_total_s: 70.80098509788513
  timestamp: 1608504351
  timesteps_since_restore: 0
  training_iteration: 43
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.4/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     44 |         72.4692  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


Result for train_f1a0a_00001:
  date: 2020-12-20_23-45-58
  done: false
  experiment_id: 919d429592254ef5bcd330e55bf0495a
  experiment_tag: 1_n2=5,u_hidden_dim=20,u_layers=2,u_rate=1.3816e-06,v_hidden_dim=5,v_layers=5,v_rate=2.4968e-05
  hostname: Pauls-MacBook-Pro-2.local
  iterations_since_restore: 47
  mean_error: tensor(1.4070)
  node_ip: 10.0.232.46
  pid: 70662
  time_since_restore: 77.34590792655945
  time_this_iter_s: 1.5704131126403809
  time_total_s: 77.34590792655945
  timestamp: 1608504358
  timesteps_since_restore: 0
  training_iteration: 47
  trial_id: f1a0a_00001
  
== Status ==
Memory usage on this node: 5.6/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 1/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (1 RUNNING, 1 TERMINATED)
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc               |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00001 | RUNNING    | 10.0.232.46:70662 |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     48 |         79.0123  |
| train_f1a0a_00000 | TERMINATED |                   |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
+-------------------+------------+-------------------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


2020-12-20 23:46:03,353	INFO tune.py:439 -- Total run time: 85.54 seconds (83.62 seconds for the tuning loop).
== Status ==
Memory usage on this node: 5.5/8.0 GiB
Using AsyncHyperBand: num_stopped=1
Bracket: Iter 64.000: None | Iter 16.000: -1.4084522724151611 | Iter 4.000: -1.4090211391448975 | Iter 1.000: -1.4106857478618622
Resources requested: 0/4 CPUs, 0/0 GPUs, 0.0/2.15 GiB heap, 0.0/0.73 GiB objects
Result logdir: /Users/paulvalsecchi/ray_results/train_2020-12-20_23-44-39
Number of trials: 2/2 (2 TERMINATED)
+-------------------+------------+-------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+
| Trial name        | status     | loc   |   n2 |   u_hidden_dim |   u_layers |      u_rate |   v_hidden_dim |   v_layers |      v_rate |   iter |   total time (s) |
|-------------------+------------+-------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------|
| train_f1a0a_00000 | TERMINATED |       |   10 |             20 |          5 | 0.000296461 |             10 |          2 | 1.42133e-05 |      1 |          3.65795 |
| train_f1a0a_00001 | TERMINATED |       |    5 |             20 |          2 | 1.38158e-06 |              5 |          5 | 2.49685e-05 |     50 |         82.2582  |
+-------------------+------------+-------+------+----------------+------------+-------------+----------------+------------+-------------+--------+------------------+


2020-12-20 23:46:03,371	WARNING experiment_analysis.py:558 -- Could not find best trial. Did you pass the correct `metric`parameter?
Traceback (most recent call last):
  File "/Users/paulvalsecchi/PycharmProjects/pythonProject/pde-solver GAN.py", line 368, in <module>
    print("Best trial config: {}".format(best_trial.config))
AttributeError: 'NoneType' object has no attribute 'config'

Process finished with exit code 1

Can you try something like tune.report(mean_error=float(error_test))? (Or error_test.item(), which also returns a plain Python number for a single-element tensor.)
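
As a minimal sketch of that conversion: the helper below (`to_python_float` is a hypothetical name, not part of Ray Tune or PyTorch) turns a single-element, tensor-like value into a plain Python float before it is reported. It only assumes that the value is either a number already or exposes an `.item()` method, as 0-d PyTorch tensors and NumPy scalars do.

```python
import numbers


def to_python_float(value):
    """Return `value` as a plain Python float.

    Hypothetical helper for illustration. Handles plain numbers and
    objects with an `.item()` method (e.g. a single-element tensor).
    """
    if isinstance(value, numbers.Number):
        return float(value)
    if hasattr(value, "item"):
        # .item() extracts the single element as a Python scalar
        return float(value.item())
    # Last resort: let float() try, and raise if it cannot convert
    return float(value)
```

In the training function this would be used as tune.report(mean_error=to_python_float(error_test)).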

This is a bad error on our side. If you could file an issue on GitHub, we’d be happy to fix it!

cc @kai

Hi. Your suggestion works, @rliaw. It is enough to ensure the value is not a tensor. Should I still file an issue on GitHub?

Yeah, could you? That’d be much appreciated!
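
For anyone who lands here with the same warning: one way to catch this early is to check the values before passing them to tune.report. The sketch below is an illustration only. `non_numeric_metrics` is a hypothetical helper, and the idea that Tune only tracks best values for plain numeric metrics is an assumption inferred from this thread, not something confirmed in it.

```python
import numbers


def non_numeric_metrics(metrics):
    """Return the sorted names of metrics whose values are not plain numbers.

    Hypothetical pre-flight check: anything listed here (for example a
    tensor object) is a candidate for conversion with float() or .item().
    """
    return sorted(
        name
        for name, value in metrics.items()
        if not isinstance(value, numbers.Number)
    )
```

Calling it as non_numeric_metrics({"mean_error": error_test}) inside the training function would list mean_error if error_test is still a tensor, and return an empty list once it has been converted to a float.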