🍽 Cuisine Classifier (XGBoost)

This model classifies dishes by their ingredients, assigning each dish to one of 20 cuisines or one of 5 regions.
It uses an XGBoost classifier trained on normalized ingredient data.


📊 Model Overview

  • Task: Multiclass Classification (Cuisines & Regions)
  • Input: List of ingredients (["salt", "flour", "sugar", ...])
  • Output: Cuisine class (e.g. "italian") or Region (e.g. "Central Europe")
  • Algorithm: XGBoost
  • Training Data: Kaggle What’s Cooking? dataset, ingredients normalized using AllRecipes dataset
  • Train/Test Split: 80 / 20, stratified
  • Cross Validation: 5-fold CV with random_state=42
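
To illustrate what the stratified 80 / 20 split listed above does, here is a minimal, dependency-free sketch. It is not the training code for this model; in practice scikit-learn's `train_test_split(..., stratify=y)` is the usual tool. The function name `stratified_split` and the toy labels are hypothetical, and reusing the seed 42 from the cross-validation setting for the split is an assumption.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) so each class keeps roughly the same share in both parts."""
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels, made up for illustration: 50 "italian", 30 "mexican", 20 "thai".
labels = ["italian"] * 50 + ["mexican"] * 30 + ["thai"] * 20
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))
print(Counter(labels[i] for i in test_idx))
```

Because the test fraction is applied per class, rare cuisines are still represented in the test set in proportion to their share of the data.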

🌍 Region Mapping

| Region | Cuisines |
|---|---|
| Central Europe | british, french, greek, irish, italian, russian, spanish |
| North America | cajun_creole, southern_us |
| Asia | chinese, filipino, indian, japanese, korean, thai, vietnamese |
| Middle East | moroccan |
| Latin America | mexican, jamaican, brazilian |
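
The mapping above can be written as a lookup table, for example to turn a cuisine label into a region label. This is a sketch only: the dictionary and function names are illustrative, and whether the region model was trained on labels derived this way is an assumption.

```python
# Region-to-cuisine mapping, copied from the table above.
REGION_TO_CUISINES = {
    "Central Europe": ["british", "french", "greek", "irish", "italian", "russian", "spanish"],
    "North America": ["cajun_creole", "southern_us"],
    "Asia": ["chinese", "filipino", "indian", "japanese", "korean", "thai", "vietnamese"],
    "Middle East": ["moroccan"],
    "Latin America": ["mexican", "jamaican", "brazilian"],
}

# Invert the table so a cuisine label can be looked up directly.
CUISINE_TO_REGION = {
    cuisine: region
    for region, cuisines in REGION_TO_CUISINES.items()
    for cuisine in cuisines
}


def cuisine_to_region(cuisine):
    """Map a cuisine label to its region; raises KeyError for unknown labels."""
    return CUISINE_TO_REGION[cuisine]


print(cuisine_to_region("italian"))
print(len(CUISINE_TO_REGION))
```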

🧪 Performance

Model Comparison

| Metric | Stratified Baseline | Logistic Regression | XGBoost |
|---|---|---|---|
| Precision (20 cuisines) | 0.05 | 0.65 | 0.75 |
| Recall (20 cuisines) | 0.05 | 0.69 | 0.66 |
| Macro F1 (20 cuisines) | 0.05 | 0.67 | 0.69 |
| Accuracy (20 cuisines) | 0.10 | 0.75 | 0.77 |
| Accuracy (5 regions) | 0.27 | 0.89 | 0.89 |

Conclusion:
XGBoost achieves the highest precision, macro F1, and accuracy on the 20-class cuisine task, although Logistic Regression has higher recall (0.69 vs. 0.66). Both models clearly outperform the stratified baseline.
For the 5-region setting, Logistic Regression and XGBoost reach the same accuracy (0.89); XGBoost nevertheless provides more consistent results across classes.


Per-Region Metrics (5 Classes)

| Region | Precision (XGB) | Recall (XGB) | F1 (XGB) |
|---|---|---|---|
| Asia | 0.94 | 0.92 | 0.93 |
| Central Europe | 0.85 | 0.93 | 0.89 |
| Latin America | 0.92 | 0.88 | 0.90 |
| Middle East | 0.88 | 0.74 | 0.81 |
| North America | 0.87 | 0.76 | 0.81 |
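
As a worked check of how the columns above relate, the sketch below recomputes each F1 score as the harmonic mean of precision and recall, then averages the reported F1 values without weighting, which is the macro F1 definition. The inputs are the rounded table values, so a recomputed score can differ from the reported one by about 0.01. The macro average printed here is derived from the table and is not a figure reported for this model.

```python
# Per-region XGBoost scores copied from the table: (precision, recall, reported F1).
REGION_SCORES = {
    "Asia": (0.94, 0.92, 0.93),
    "Central Europe": (0.85, 0.93, 0.89),
    "Latin America": (0.92, 0.88, 0.90),
    "Middle East": (0.88, 0.74, 0.81),
    "North America": (0.87, 0.76, 0.81),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for region, (p, r, reported) in REGION_SCORES.items():
    print(f"{region}: recomputed F1 = {f1_score(p, r):.3f}, reported = {reported:.2f}")

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1 in REGION_SCORES.values()) / len(REGION_SCORES)
print(f"Macro F1 over regions (from reported values): {macro_f1:.3f}")
```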

🚀 How to Use

from huggingface_hub import hf_hub_download
import joblib

class CuisineClassifier:

    def __init__(self, classifier="region"):
        print("Initializing CuisineClassifier...")

        components = ["cuisine_pipeline", "label_encoder"]
        paths = {}

        print("Downloading files from Hugging Face Hub...")
        # Pick the repository subfolder that matches the requested classifier.
        folder = "cuisine_classifier" if classifier == "cuisine" else "region_classifier"
        for name in components:
            print(f"Downloading {name}.joblib ...")
            try:
                paths[name] = hf_hub_download(
                    repo_id="NoahMeissner/CuisineClassifier",
                    filename=f"{folder}/{name}.joblib",
                )
                print(f"{name} downloaded.")
            except Exception as e:
                print(f"Failed to download {name}: {e}")
                raise

        print("Loading model components with joblib...")
        try:
            self.model = joblib.load(paths["cuisine_pipeline"])
            print("Model loaded.")
            self.label_encoder = joblib.load(paths["label_encoder"])
            print("Label encoder loaded.")
        except Exception as e:
            print(f"Failed to load components: {e}")
            raise

        print("All components loaded successfully.")

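    def classify(self, text_input):
        """Predict a label for one dish given as a list of ingredient strings."""
        # Join the ingredients into one space-separated string and predict on a one-item batch.
        data = " ".join(text_input)
        predicted_class = self.model.predict([data])
        # Decode the predicted class index back to its label; the result is an array with one entry.
        predicted_label = self.label_encoder.inverse_transform(predicted_class)
        return predicted_label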
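
The `classify` method joins the ingredient list into a single space-separated string and passes it to the pipeline as a one-item batch. The sketch below isolates that step so the input format is easy to inspect. `prepare_input` is a hypothetical helper, not part of the repository, and how the pipeline tokenizes the joined string is not stated here.

```python
def prepare_input(ingredients):
    """Join ingredient strings into one document and wrap it in a single-item batch."""
    return [" ".join(ingredients)]


# Multi-word ingredients such as "olive oil" lose their boundaries once joined.
batch = prepare_input(["olive oil", "garlic", "tomato"])
print(batch)
```

With the class above, a call such as `CuisineClassifier(classifier="cuisine").classify(["salt", "flour", "sugar"])` would return an array holding one predicted label, assuming the download from the Hub succeeds.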