I have a PyTorch tabular model that I have set up for hyperparameter tuning with Ray Tune as follows:
from ray import tune

# Search space: each hyperparameter is sampled from a fixed set of choices.
config = {
    "num_layers": tune.choice([1, 2, 3]),
    "num_trees": tune.choice([512, 768, 1024]),
    "depth": tune.choice([2, 4, 6]),
    "batch_size": tune.choice([128, 512, 1024]),
}

def train_tabular(config, df_train, df_test):
    # build_model is my own helper, defined elsewhere.
    model = build_model(
        num_trees=config["num_trees"],
        depth=config["depth"],
        num_layers=config["num_layers"],
        batch_size=config["batch_size"],
        use_embedding=True,
        epochs=10,
    )
    model.fit(train=df_train, validation=df_test)
    results = model.evaluate(df_test)
    # Report the test MSE back to Tune.
    tune.report(mse=results[0]["test_mean_squared_error"])

analysis = tune.run(
    tune.with_parameters(train_tabular, df_train=df_train, df_test=df_val),
    resources_per_trial={"gpu": 1},
    mode="min",
    config=config,
)
However, when this runs it terminates after the first configuration:
Trial name                 status      loc  batch_size  depth  num_layers  num_trees  iter  total time (s)        mse
train_tabular_623df_00000  TERMINATED              128      6           2       1024     1          4951.7  0.0099718
...
2021-09-02 01:55:58,998 INFO tune.py:561 -- Total run time: 4954.25 seconds (4954.01 seconds for the tuning loop).
Best config: {'num_layers': 2, 'num_trees': 1024, 'depth': 6, 'batch_size': 128}
Does anyone see something obviously wrong with what I have set up here? There is no error message; the run simply stops after the first trial finishes.
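One thing that may be relevant: `tune.run` has a `num_samples` argument that defaults to 1, and I am not passing it. If `tune.choice` entries are sampled rather than enumerated, that alone would explain a single trial. Below is a minimal plain-Python sketch of that distinction. It does not use Ray at all; `space`, `sample_configs`, and `grid` are hypothetical names made up for illustration and are not part of Tune's API.

```python
import itertools
import random

# Plain-Python stand-in for the search space in the question (no Ray involved).
space = {
    "num_layers": [1, 2, 3],
    "num_trees": [512, 768, 1024],
    "depth": [2, 4, 6],
    "batch_size": [128, 512, 1024],
}

def sample_configs(space, num_samples=1, seed=0):
    """Hypothetical helper: draw num_samples random configs, one value per key."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(num_samples)
    ]

# Every combination, as an exhaustive grid search would enumerate them.
grid = [dict(zip(space, combo)) for combo in itertools.product(*space.values())]

print(len(sample_configs(space)))                  # number of random draws with the default
print(len(sample_configs(space, num_samples=10)))  # number of random draws when asked for 10
print(len(grid))                                   # size of the full grid
```

If Tune behaves the same way, then passing something like `num_samples=10` to `tune.run`, or swapping `tune.choice` for `tune.grid_search`, would presumably produce more than one trial, but I have not confirmed this on my setup.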