---
datasets:
- Elsafty
- Chula
- DSE
library_name: timm
license: cc-by-nc-4.0
pipeline_tag: image-feature-extraction
tags:
- red-blood-cells
- hematology
- medical-imaging
- vision-transformer
- dino
- dinov2
- feature-extraction
- foundation-model
model-index:
- name: RedDino-large
  results:
  - task:
      type: image-classification
      name: RBC Shape Classification
    dataset:
      name: Elsafty
      type: Classification
    metrics:
    - type: Weighted F1
      value: 88.5
    - type: Balanced Accuracy
      value: 89.1
    - type: Accuracy
      value: 88.4
    - type: Weighted F1
      value: 83.9
    - type: Balanced Accuracy
      value: 79.0
    - type: Accuracy
      value: 85.0
    - type: Weighted F1
      value: 86.6
    - type: Balanced Accuracy
      value: 60.1
    - type: Accuracy
      value: 86.6
---

# RedDino: A Foundation Model for Red Blood Cell Analysis

**RedDino** is a self-supervised Vision Transformer foundation model specifically designed for **red blood cell (RBC)** image analysis, as presented in the paper [RedDino: A foundation model for red blood cell analysis](https://arxiv.org/abs/2508.08180).

It leverages a tailored version of the **DINOv2** framework, trained on a meticulously curated dataset of **1.25 million RBC images** from diverse acquisition modalities and sources. This model excels at extracting robust, general-purpose features for downstream hematology tasks such as **shape classification**, **morphological subtype recognition**, and **batch-effect–robust analysis**.

Unlike general-purpose models pretrained on natural images, RedDino incorporates hematology-specific augmentations, architectural tweaks, and RBC-tailored data preprocessing, enabling **state-of-the-art performance** on multiple RBC benchmarks.

> 🧠 Developed by [Luca Zedda](https://orcid.org/0009-0001-8488-1612), [Andrea Loddo](https://orcid.org/0000-0002-6571-3816), [Cecilia Di Ruberto](https://orcid.org/0000-0003-4641-0307), and [Carsten Marr](https://orcid.org/0000-0003-2154-4552)  
> 🏥 University of Cagliari & Helmholtz Munich  
> 📄 Preprint: [arXiv:2508.08180](https://arxiv.org/abs/2508.08180)  
> 💻 Code: [https://github.com/Snarci/RedDino](https://github.com/Snarci/RedDino)

---

## Model Details

-   **Architecture:** ViT-large, patch size 14
-   **SSL framework:** DINOv2 (customized for RBC morphology)
-   **Pretraining dataset:** Curated RBC images from 18 datasets (multiple modalities and sources)
-   **Embedding size:** 1024
-   **Intended use:** RBC morphology classification, feature extraction, batch-effect–robust analysis

Notes:

-   RBC-specific training strategy: the KoLeo regularizer is removed and Sinkhorn-Knopp centering is adopted.
-   Training on smear patches (not only single cells) to enhance cross-source generalization.

## Example Usage

```python
import timm
import torch
from PIL import Image
from torchvision import transforms

# Load the pretrained model from the Hugging Face Hub and switch to inference mode
model = timm.create_model("hf_hub:Snarcy/RedDino-large", pretrained=True)
model.eval()
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Load the image and apply ImageNet-style resizing and normalization
image = Image.open("path/to/rbc_image.jpg").convert("RGB")
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
input_tensor = transform(image).unsqueeze(0).to(device)  # add a batch dimension

# Extract the image embedding (embedding size 1024 for RedDino-large, see Model Details)
with torch.no_grad():
    embedding = model(input_tensor)
```

## Model Variants

RedDino comes in three sizes to suit different computational requirements and performance needs:

| Model Variant | Embedding Size | Parameters | Usage |
|---------------|----------------|------------|--------|
| **RedDino-small** | 384 | 22M | `timm.create_model("hf_hub:Snarcy/RedDino-small", pretrained=True)` |
| **RedDino-base** | 768 | 86M | `timm.create_model("hf_hub:Snarcy/RedDino-base", pretrained=True)` |
| **RedDino-large** | 1024 | 304M | `timm.create_model("hf_hub:Snarcy/RedDino-large", pretrained=True)` |

Choose the variant that best fits your computational budget and performance requirements. Larger models generally provide richer feature representations at the cost of increased computational overhead.
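
As a small illustration of choosing a variant by budget, the sketch below encodes the table above in a lookup and returns the largest variant that fits a parameter limit. The `RedDinoVariant` class and the `largest_variant_within` helper are hypothetical names introduced here for illustration; they are not part of `timm` or the RedDino repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedDinoVariant:
    hub_id: str           # identifier passed to timm.create_model
    embedding_dim: int    # embedding size listed in the table
    params_millions: int  # parameter count listed in the table, in millions


# Values copied from the Model Variants table.
VARIANTS = {
    "small": RedDinoVariant("hf_hub:Snarcy/RedDino-small", 384, 22),
    "base": RedDinoVariant("hf_hub:Snarcy/RedDino-base", 768, 86),
    "large": RedDinoVariant("hf_hub:Snarcy/RedDino-large", 1024, 304),
}


def largest_variant_within(max_params_millions: int) -> RedDinoVariant:
    """Return the largest variant whose parameter count fits the budget."""
    candidates = [
        v for v in VARIANTS.values() if v.params_millions <= max_params_millions
    ]
    if not candidates:
        raise ValueError(
            f"No RedDino variant has at most {max_params_millions}M parameters"
        )
    return max(candidates, key=lambda v: v.params_millions)
```

The returned `hub_id` can then be passed to `timm.create_model(variant.hub_id, pretrained=True)`, and `embedding_dim` gives the input size to use for any downstream classifier head.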

---

## Benchmark Results

RedDino was benchmarked on RBC classification datasets, including Elsafty, Chula, and DSE, and outperformed baselines such as ResNet50, DinoBloom, and DINOv2.

| Model             | Dataset   | Metric      | Linear Probing (wF1) | 1-NN (wF1) | 20-NN (wF1) |
|-------------------|-----------|-------------|----------------------|------------|-------------|
| ResNet50          | Elsafty   | Weighted F1 | 77.6 ± 8.1           | 64.3 ± 4.8 | 66.2 ± 4.9  |
| DinoBloom-S       | Elsafty   | Weighted F1 | 83.2 ± 8.2           | 73.1 ± 5.1 | 76.5 ± 4.2  |
| DINOv2 (small)    | Elsafty   | Weighted F1 | 82.1 ± 8.2           | 73.5 ± 4.8 | 77.2 ± 4.6  |
| RedDino small     | Elsafty   | Weighted F1 | 86.0 ± 7.0           | 76.8 ± 4.9 | 80.0 ± 4.5  |
| RedDino base      | Elsafty   | Weighted F1 | 88.1 ± 4.9           | 78.8 ± 3.6 | 82.6 ± 2.8  |
| RedDino large     | Elsafty   | Weighted F1 | 88.5 ± 5.5           | 78.5 ± 4.6 | 81.6 ± 4.7  |

On Chula and DSE datasets, RedDino consistently surpassed all other models in feature quality (linear probing) with average improvements of 2–4% over prior approaches in key metrics.
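
To make the k-NN columns and the weighted F1 metric concrete, here is a minimal NumPy sketch of how frozen embeddings could be evaluated. It is not the authors' evaluation code: the cosine-similarity neighbor search, the tie-breaking rule, and the function names `knn_predict` and `weighted_f1` are assumptions made for illustration, and labels are assumed to be non-negative integers.

```python
import numpy as np


def knn_predict(train_x, train_y, test_x, k=1):
    """Majority vote among the k most cosine-similar training embeddings.

    Assumption: cosine similarity is used as the neighbor metric.
    Ties in the vote resolve to the smallest label (np.bincount + argmax).
    """
    train_x = train_x / np.linalg.norm(train_x, axis=1, keepdims=True)
    test_x = test_x / np.linalg.norm(test_x, axis=1, keepdims=True)
    sims = test_x @ train_x.T                   # (n_test, n_train)
    nearest = np.argsort(-sims, axis=1)[:, :k]  # indices of the k nearest
    neighbor_labels = train_y[nearest]          # (n_test, k)
    return np.array([np.bincount(row).argmax() for row in neighbor_labels])


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    total = 0.0
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * np.sum(y_true == c)       # weight by class support
    return float(total / len(y_true))
```

With embeddings stacked into arrays of shape `(n_samples, embedding_dim)`, calling `weighted_f1(test_y, knn_predict(train_x, train_y, test_x, k=20))` would give a score comparable in spirit to the 20-NN column, up to the assumptions noted above.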

---

## Highlights

-   **Foundation model** for RBC analysis trained on the largest available multi-source RBC image set (1.25M+ images), prepared with CellPose-based instance segmentation and patch extraction.
-   **DINOv2-based self-supervised learning** for label-efficient pretraining and robust, transferable features.
-   **Model architecture and key innovations**:
    -   Patch-based training (224×224 px) shown to outperform single-cell training.
    -   Novel data augmentation via Albumentations (32 pixel-level strategies).
    -   Removal of the KoLeo regularizer and adoption of Sinkhorn-Knopp centering for improved representation in RBC-specific domains.
    -   Suite of models (small, base, large) covering 22M–304M parameters.
-   **Generalization**: Strong adaptation across varied protocols, microscopes, and imaging sites. Demonstrated resistance to batch effects and out-of-domain variance.
-   **Interpretability tools**: PCA/UMAP visualizations reveal clustering by phenotype and batch, distinguishing abnormal cells (e.g., malaria, echinocytes).
-   **Easy deployment**: Models and code are available on [GitHub](https://github.com/Snarci/RedDino) and [Hugging Face](https://huggingface.co/collections/Snarcy/reddino-689a13e29241d2e5690202fc).
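
For readers unfamiliar with the Sinkhorn-Knopp centering mentioned above, the following NumPy sketch shows the general idea as it is commonly formulated in self-supervised learning: teacher scores are turned into soft assignments by alternately normalizing over prototypes and over samples. This is a generic illustration, not the RedDino training code; the function name, the default temperature, and the iteration count are illustrative assumptions.

```python
import numpy as np


def sinkhorn_knopp(teacher_logits, temperature=0.04, n_iterations=3):
    """Turn teacher logits of shape (batch, prototypes) into soft assignments.

    Each returned row sums to 1, and the total mass per prototype is pushed
    toward an equal share of the batch. Defaults are illustrative assumptions.
    """
    logits = teacher_logits.astype(np.float64) / temperature
    logits -= logits.max()  # global shift for stability; cancels after normalization
    q = np.exp(logits).T    # shape (prototypes, batch)
    n_prototypes, batch_size = q.shape
    q /= q.sum()
    for _ in range(n_iterations):
        # Give every prototype the same total mass across the batch.
        q /= q.sum(axis=1, keepdims=True)
        q /= n_prototypes
        # Give every sample the same total mass across prototypes.
        q /= q.sum(axis=0, keepdims=True)
        q /= batch_size
    q *= batch_size         # each sample's assignment now sums to 1
    return q.T              # back to shape (batch, prototypes)
```

Compared with subtracting a running mean from the teacher outputs, this normalization spreads assignments across prototypes within each batch, which is one way to discourage collapse onto a few prototypes.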

---

## 📝 Citation

If you use this model, please cite the following paper:

**RedDino: A foundation model for red blood cell analysis**  
Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Carsten Marr — 2025  
Preprint: arXiv:2508.08180. https://arxiv.org/abs/2508.08180

```bibtex
@misc{zedda2025reddinofoundationmodelred,
      title={RedDino: A foundation model for red blood cell analysis}, 
      author={Luca Zedda and Andrea Loddo and Cecilia Di Ruberto and Carsten Marr},
      year={2025},
      eprint={2508.08180},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.08180}, 
}
```

---

## Summary

RedDino is the first family of foundation models tailored for comprehensive red blood cell image analysis, using large-scale self-supervised learning to set new performance benchmarks and generalization standards for computational hematology. Models and pretrained weights are available for research and practical deployment.