HW2 Classical AutoML — AutoGluon TabularPredictor

Model Overview

This model was trained using AutoGluon TabularPredictor as part of Homework 2 for 24-679.
It predicts the target column (color) of Scotty’s HW1 tabular dataset from five numeric flower features (diameter, petal length, petal width, petal count, stem height).

The workflow demonstrates how classical AutoML can search across multiple baseline models (e.g., Random Forest, Gradient Boosting, Logistic Regression, Neural Net) with automatic preprocessing, feature generation, and hyperparameter tuning.

Dataset

  • Source: Scotty’s HW1 tabular dataset on Hugging Face (scottymcgee/flowers)
  • Samples: ~30 original samples, expanded via augmentation
  • Features: numeric (flower_diameter_cm, petal_length_cm, petal_width_cm, petal_count, stem_height_cm)
  • Target: color (multiclass, 6 possible values)
  • Split: 80% training, 20% validation
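
The 80/20 split can be sketched as a seeded shuffle of row indices. This is a minimal, standard-library illustration only: the helper name `split_indices` and the row count of 100 are hypothetical, and the original notebook may have used a different utility, so the exact row assignment will not match the one used for training.

```python
import random

def split_indices(n_rows, train_frac=0.8, seed=42):
    """Return (train_idx, val_idx) for a reproducible random split."""
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    cut = int(n_rows * train_frac)
    return sorted(indices[:cut]), sorted(indices[cut:])

# Hypothetical row count; the augmented dataset size is not stated in this card.
train_idx, val_idx = split_indices(n_rows=100)
print(len(train_idx), len(val_idx))
```

With a pandas DataFrame `df`, `df.iloc[train_idx]` and `df.iloc[val_idx]` would then select the training and validation rows.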

Training Configuration

  • Framework: AutoGluon TabularPredictor
  • Presets: medium_quality (balanced speed vs. accuracy)
  • Problem Type: multiclass classification
  • Time Limit: 600 seconds (10 minutes)
  • Random Seed: 42 (for reproducible train/val split)
  • Hardware: Google Colab CPU/GPU runtime
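
The configuration above might translate into an AutoGluon call along the following lines. This is a sketch under stated assumptions, not the notebook's actual code: the function name `train_predictor`, the save path, and the choice of `eval_metric="f1_weighted"` are assumptions (the card reports weighted F1 but does not say how the metric was configured). The `autogluon` import sits inside the function so the constants can be inspected without the library installed.

```python
LABEL = "color"
FEATURES = [
    "flower_diameter_cm",
    "petal_length_cm",
    "petal_width_cm",
    "petal_count",
    "stem_height_cm",
]

# eval_metric is an assumption; label and problem type come from the card.
PREDICTOR_KWARGS = {
    "label": LABEL,
    "problem_type": "multiclass",
    "eval_metric": "f1_weighted",
}
FIT_KWARGS = {"presets": "medium_quality", "time_limit": 600}

def train_predictor(train_df, val_df, save_path="autogluon_predictor_dir"):
    """Fit a TabularPredictor and return it with a validation leaderboard."""
    from autogluon.tabular import TabularPredictor  # requires autogluon

    predictor = TabularPredictor(path=save_path, **PREDICTOR_KWARGS)
    predictor.fit(train_data=train_df[FEATURES + [LABEL]], **FIT_KWARGS)
    # Scores every trained model on the held-out validation rows.
    leaderboard = predictor.leaderboard(val_df[FEATURES + [LABEL]])
    return predictor, leaderboard
```

Here `train_df` and `val_df` are pandas DataFrames produced by the 80/20 split described in the Dataset section.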

AutoGluon automatically handled:

  • Standardization of numeric features
  • Encoding of categorical features (none in this dataset)
  • Model ensembling and stacking

Results

  • Best model: the top-ranked model on the AutoGluon leaderboard
  • Validation Metric (Weighted F1): approximately 0.9 (the exact value varies from run to run)
  • Leaderboard: includes candidate models such as RandomForest, ExtraTrees, GradientBoosting, LightGBM

Note: Because the dataset is small, metrics can vary between runs.
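
For readers unfamiliar with the metric, weighted F1 is the per-class F1 score averaged with weights proportional to each class's number of true samples. The pure-Python sketch below illustrates the definition; the function name `weighted_f1` and the toy labels are hypothetical and are not results from this model.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    support = Counter(y_true)      # true samples per class (tp + fn)
    predicted = Counter(y_pred)    # predicted samples per class (tp + fp)
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = 0.0
    for label, n_true in support.items():
        # F1 = 2*tp / (2*tp + fp + fn) = 2*tp / (n_true + n_predicted)
        f1 = 2 * true_pos[label] / (n_true + predicted[label])
        total += n_true * f1
    return total / len(y_true)

# Toy labels for illustration only.
y_true = ["red", "red", "blue", "blue", "blue", "pink"]
y_pred = ["red", "blue", "blue", "blue", "blue", "pink"]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.8175
```

With only a handful of validation rows per class, a single misclassified sample shifts this score noticeably, which is consistent with the run-to-run variation noted above.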

Repository Artifacts

  • autogluon_predictor.pkl → cloudpickled predictor (loadable if library versions match)
  • autogluon_predictor_dir.zip → zipped native AutoGluon directory (preferred for portability)
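
Loading the zipped native directory could look roughly like this. It is a sketch with assumptions: the helper names are hypothetical, the archive's internal layout is not documented here (so the code searches for the `predictor.pkl` file that AutoGluon is assumed to write at the root of a saved predictor), and the example feature values are made up.

```python
import zipfile
from pathlib import Path

def extract_predictor_dir(zip_path, dest_dir):
    """Unzip the archive and return the folder that contains predictor.pkl."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # The predictor may sit at the archive root or inside a subfolder.
    matches = sorted(dest.rglob("predictor.pkl"))
    if not matches:
        raise FileNotFoundError(f"no predictor.pkl found under {dest}")
    return matches[0].parent

def load_and_predict(predictor_dir, rows):
    """Load the saved predictor and predict color for a list of feature dicts."""
    import pandas as pd                              # requires pandas
    from autogluon.tabular import TabularPredictor   # requires autogluon

    predictor = TabularPredictor.load(str(predictor_dir))
    return predictor.predict(pd.DataFrame(rows))

# Made-up feature values, for illustration only.
example_row = {
    "flower_diameter_cm": 6.0,
    "petal_length_cm": 3.0,
    "petal_width_cm": 1.5,
    "petal_count": 8,
    "stem_height_cm": 30.0,
}
```

A typical call would be `load_and_predict(extract_predictor_dir("autogluon_predictor_dir.zip", "predictor"), [example_row])`, run in an environment whose AutoGluon version matches the one used for training.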

AI Tool Disclosure

This notebook used ChatGPT for scaffolding code and documentation. All dataset selection, training, evaluation, and uploads were performed by the student.

