# HW2 Classical AutoML — AutoGluon TabularPredictor

## Model Overview

This model was trained using AutoGluon's `TabularPredictor` as part of Homework 2 for 24-679.
It predicts the target column (`color`) of Scotty's HW1 tabular dataset from five numeric flower features (diameter, petal length, petal width, petal count, stem height).
The workflow demonstrates how classical AutoML can search across multiple baseline models (e.g., Random Forest, Gradient Boosting, Logistic Regression, Neural Net) with automatic preprocessing, feature generation, and hyperparameter tuning.
## Dataset

- Source: Scotty's HW1 tabular dataset on Hugging Face (`scottymcgee/flowers`)
- Samples: ~30 original samples, expanded via augmentation
- Features: numeric (`flower_diameter_cm`, `petal_length_cm`, `petal_width_cm`, `petal_count`, `stem_height_cm`)
- Target: `color` (multiclass, 6 possible values)
- Split: 80% training, 20% validation
## Training Configuration

- Framework: AutoGluon `TabularPredictor`
- Presets: `medium_quality` (balanced speed vs. accuracy)
- Problem Type: `multiclass` classification
- Time Limit: 600 seconds (10 minutes)
- Random Seed: 42 (for a reproducible train/validation split)
- Hardware: Google Colab CPU/GPU runtime
AutoGluon automatically handled:
- Standardization of numeric features
- Encoding of categorical features (none in this dataset)
- Model ensembling and stacking
## Results

- Best model: reported by the AutoGluon leaderboard
- Validation metric (weighted F1): ~0.9 (exact value depends on random seed and run)
- Leaderboard: includes candidate models such as RandomForest, ExtraTrees, GradientBoosting, and LightGBM

Note: Because the dataset is small, metrics may vary slightly across runs.
## Repository Artifacts

- `autogluon_predictor.pkl` → cloudpickled predictor (loadable if library versions match)
- `autogluon_predictor_dir.zip` → zipped native AutoGluon directory (preferred for portability)
## AI Tool Disclosure
This notebook used ChatGPT for scaffolding code and documentation. All dataset selection, training, evaluation, and uploads were performed by the student.
## Evaluation Results

- Accuracy on the `scottymcgee/flowers` test set (self-reported): 0.870
- Macro F1 on the `scottymcgee/flowers` test set (self-reported): 0.840