Hi all,

I have a work project with 10,000 datasets; each dataset is split into training, validation, and test sets.

The data-mining/machine-learning problem is binary classification, and of course there are a bunch of models to compare and select from, e.g. logistic regression, XGBoost, SVC, deep learning, etc.

I have to compare classification performance metrics across all datasets and all models.

So the work-flow is:

```
for model in models:
    for data in datasets:
        train the model and collect test metrics
    average the metrics over all datasets
compare the means across models and make a table of the metrics for all models
```
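The loop above can be sketched directly with scikit-learn and pandas. This is a minimal illustration, not a full framework: the datasets here are small synthetic stand-ins generated with `make_classification`, the model names and the choice of AUC as the metric are my assumptions, and in practice you would load your own train/test splits and model list.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical model zoo; factories so each dataset gets a fresh estimator.
models = {
    "logreg": lambda: LogisticRegression(max_iter=1000),
    "svc": lambda: SVC(probability=True),
}

# Three small synthetic datasets stand in for the 10,000 real ones.
datasets = []
for seed in range(3):
    X, y = make_classification(n_samples=200, n_features=10, random_state=seed)
    datasets.append(train_test_split(X, y, test_size=0.3, random_state=seed))

rows = []
for name, make_model in models.items():
    scores = []
    for X_tr, X_te, y_tr, y_te in datasets:
        clf = make_model().fit(X_tr, y_tr)  # train on this dataset's train split
        scores.append(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
    rows.append({"model": name, "mean_auc": np.mean(scores)})

# One row per model: the mean test metric across all datasets.
table = pd.DataFrame(rows)
print(table)
```

Parallelizing the inner loop (e.g. with `joblib`) would matter at the scale of 10,000 datasets, but the comparison table itself is just this nested loop plus an aggregation.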

Is there an easy-to-use framework that already implements this model-comparison/selection process?

Since these are common models, I would imagine there are already out-of-the-box solutions for such model comparison.

Could anybody give me some pointers?

Thanks a lot!