Vision Transformer - Golden Foot Football Players

This model is a Vision Transformer (ViT) fine-tuned on the Golden Foot Football Players dataset.

Methodology

  • Base model: google/vit-base-patch16-224
  • Dataset: 22 classes (players nominated for the Golden Foot)
  • Technique: transfer learning (only the final layer retrained)
  • Optimizer: AdamW
  • Scheduler: StepLR (gamma=0.5 every 2 epochs)
  • Class balancing: WeightedRandomSampler
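
The training script is not part of this card, so the following is only a minimal sketch of how the pieces listed above might fit together in PyTorch. The helper names `make_balanced_sampler` and `freeze_all_but_head`, the `TinyClassifier` stand-in, the learning rate, and the toy labels are all assumptions made for illustration; only the head name `classifier` matches what `ViTForImageClassification` uses.

```python
from collections import Counter

import torch
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import StepLR
from torch.utils.data import WeightedRandomSampler


def make_balanced_sampler(labels):
    """Draw each example with probability inversely proportional to its class size."""
    counts = Counter(labels)
    weights = torch.tensor([1.0 / counts[y] for y in labels], dtype=torch.double)
    return WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)


def freeze_all_but_head(model, head_name="classifier"):
    """Freeze every parameter except those of the classification head."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(head_name)
    return [p for p in model.parameters() if p.requires_grad]


# Hypothetical stand-in so the sketch runs offline. The real backbone would
# presumably be loaded with something like:
#   ViTForImageClassification.from_pretrained(
#       "google/vit-base-patch16-224", num_labels=22, ignore_mismatched_sizes=True)
class TinyClassifier(nn.Module):
    def __init__(self, num_labels=22):
        super().__init__()
        self.backbone = nn.Linear(8, 4)
        self.classifier = nn.Linear(4, num_labels)

    def forward(self, x):
        return self.classifier(torch.relu(self.backbone(x)))


model = TinyClassifier()
trainable = freeze_all_but_head(model)

optimizer = AdamW(trainable, lr=1e-3)  # learning rate is an assumed value
scheduler = StepLR(optimizer, step_size=2, gamma=0.5)  # halve the LR every 2 epochs

toy_labels = [0] * 8 + [1] * 2  # imbalanced toy labels, not the real dataset
sampler = make_balanced_sampler(toy_labels)

lrs = []
for epoch in range(4):
    # One pass over DataLoader(dataset, sampler=sampler) would go here.
    lrs.append(optimizer.param_groups[0]["lr"])
    optimizer.step()
    scheduler.step()
```

Passing only the head's parameters to the optimizer keeps the pretrained backbone untouched, which is one common reading of "only the final layer retrained".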

Results

  • Test accuracy: 0.91
  • Precision: 0.91
  • Recall: 0.91
  • F1-score: 0.91
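
The card does not say how precision, recall, and F1 were averaged across the 22 classes. As an illustration only, the sketch below computes them with support-weighted averaging, which is an assumption; under that scheme recall is always equal to accuracy. The function name `weighted_metrics` and the toy labels are hypothetical and are not the model's actual predictions.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1 (assumed averaging)."""
    n = len(y_true)
    support = Counter(y_true)    # number of true examples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for cls, n_cls in support.items():
        p = correct[cls] / predicted[cls] if predicted[cls] else 0.0
        r = correct[cls] / n_cls
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        w = n_cls / n  # weight each class by its share of the test set
        precision += w * p
        recall += w * r
        f1 += w * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
metrics = weighted_metrics(y_true, y_pred)
```

With an imbalanced dataset, macro averaging (an unweighted mean over classes) would give more weight to rarely photographed players and could yield different numbers.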

Limitations

  • Class-imbalanced dataset (e.g., some players have few images)
  • Images with heterogeneous resolutions

Usage

```python
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor

# Load the fine-tuned model and its matching image processor from the Hub
model = ViTForImageClassification.from_pretrained("aaronqg/vit-football-players")
processor = ViTImageProcessor.from_pretrained("aaronqg/vit-football-players")

# Preprocess one image and run it through the model
image = Image.open("jugador.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)

# Index of the highest-scoring class
pred = outputs.logits.argmax(-1).item()
print("Prediction:", pred)
```
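
The snippet above prints a class index. A possible way to turn the logits into ranked, named predictions is sketched below. The helper `top_k_predictions`, the example logits, and the label map are hypothetical; with the real model one would likely pass `outputs.logits` and `model.config.id2label`, though whether that mapping holds player names or generic placeholders depends on how the checkpoint was saved.

```python
import torch


def top_k_predictions(logits, id2label, k=3):
    """Return the k most likely (label, probability) pairs for a single image."""
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    k = min(k, probs.numel())
    top = torch.topk(probs, k)
    return [(id2label[i], p) for i, p in zip(top.indices.tolist(), top.values.tolist())]


# Hypothetical logits and label map, for illustration only.
example_logits = torch.tensor([[2.0, 0.5, -1.0]])
example_id2label = {0: "player_a", 1: "player_b", 2: "player_c"}
ranked = top_k_predictions(example_logits, example_id2label, k=2)
```

Reporting probabilities alongside labels makes low-confidence predictions easier to spot, which may matter for players with few training images.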

Model size: 85.8M params (Safetensors, tensor type F32)