Afri-Aya Gemma 3 4B Vision - LoRA Adapters
This repository contains the LoRA (Low-Rank Adaptation) adapters for the Afri-Aya Gemma 3 4B Vision model, fine-tuned on the Afri-Aya dataset for African cultural visual question answering.
Model Details
- Base Model: unsloth/gemma-3-4b-pt
- Training Dataset: CohereLabsCommunity/afri-aya (2,466 images, 13 African languages)
- Fine-tuning Method: LoRA with Unsloth
- Languages Supported: English + 13 African languages
- LoRA Rank: 16
- Training Framework: Unsloth + TRL
Repository Contents
This repository contains the LoRA adapter weights that can be applied to the base Gemma 3 4B model:
- adapter_config.json - LoRA configuration
- adapter_model.safetensors - LoRA adapter weights
- README.md - This documentation
- Other supporting files for the LoRA adapters
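After downloading the repository, the adapter configuration can be inspected with plain Python before loading any weights. The sketch below assumes the standard PEFT key names (`r`, `lora_alpha`, `target_modules`, `base_model_name_or_path`); the helper names and the sample values are illustrative and are not copied from this repository's actual file.

```python
import json
from pathlib import Path


def load_adapter_config(path: str) -> dict:
    """Read adapter_config.json from a local copy of the repository."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_adapter_config(cfg: dict) -> dict:
    """Pick out the main LoRA settings from a PEFT-style config dict."""
    return {
        "base_model": cfg.get("base_model_name_or_path"),
        "rank": cfg["r"],
        "alpha": cfg["lora_alpha"],
        # May be a list of module names or a single pattern string.
        "target_modules": cfg.get("target_modules"),
    }


# Illustrative values mirroring the settings listed in this README.
example_cfg = {
    "base_model_name_or_path": "unsloth/gemma-3-4b-pt",
    "r": 16,
    "lora_alpha": 32,
}
summary = summarize_adapter_config(example_cfg)
print(summary["base_model"], summary["rank"], summary["alpha"])
```

For a real check, `load_adapter_config("adapter_config.json")` can be passed to `summarize_adapter_config` in place of `example_cfg`.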
Usage
Option 1: Load LoRA Adapters with Unsloth
from unsloth import FastVisionModel
# Load base model with LoRA adapters
model, processor = FastVisionModel.from_pretrained(
    model_name="Bronsn/afri-aya-gemma-3-4b-vision-lora",
    load_in_4bit=True,
)
# Enable inference mode
FastVisionModel.for_inference(model)
Option 2: Use with PEFT/Transformers
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor
from peft import PeftModel
# Load base model
base_model = AutoModelForImageTextToText.from_pretrained(
    "unsloth/gemma-3-4b-pt",
    torch_dtype=torch.float16,
    device_map="auto",
)
# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "Bronsn/afri-aya-gemma-3-4b-vision-lora")
processor = AutoProcessor.from_pretrained("unsloth/gemma-3-4b-pt")
Option 3: Merge and Use
For production use, you might want to merge the adapters with the base model:
from unsloth import FastVisionModel
# Load with LoRA
model, processor = FastVisionModel.from_pretrained(
    model_name="Bronsn/afri-aya-gemma-3-4b-vision-lora",
    load_in_4bit=True,
)
# Merge the adapters into the base weights and save model and processor in 16-bit
model.save_pretrained_merged("merged_model", processor, save_method="merged_16bit")
Merged Model
For convenience, we also provide a merged version of this model at: Bronsn/afri-aya-gemma-3-4b-vision
The merged model is ready to use without requiring LoRA loading.
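Merging is possible because LoRA only adds a low-rank update to each targeted weight matrix, so the adapter can be folded in as W_merged = W + (alpha / r) * B * A. The toy sketch below uses made-up 2x2 matrices and rank 1, not real model weights, to illustrate that the merged matrix produces the same output as applying the base weight and the adapter separately.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy values (hypothetical, chosen for easy arithmetic).
W = [[1.0, 0.0], [0.0, 1.0]]  # base weight, shape (2, 2)
B = [[1.0], [2.0]]            # LoRA B, shape (2, rank)
A = [[0.5, 0.5]]              # LoRA A, shape (rank, 2)
alpha, rank = 2, 1
x = [[1.0], [3.0]]            # input column vector

merged = merge_lora(W, B, A, alpha, rank)

# Unmerged path: W x + (alpha / rank) * B (A x)
base_out = matmul(W, x)
adapter_out = matmul(B, matmul(A, x))
separate = [[b[0] + (alpha / rank) * a[0]] for b, a in zip(base_out, adapter_out)]

merged_out = matmul(merged, x)
print(merged_out == separate)
```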
Training Details
- LoRA Rank: 16
- LoRA Alpha: 32
- Target Modules: Vision and text projection layers
- Learning Rate: 2e-4
- Batch Size: 1 (with gradient accumulation)
- Epochs: 1
- Training Framework: Unsloth for efficient fine-tuning
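To give a feel for what these settings imply, the sketch below computes how many trainable parameters rank 16 adds to a single linear layer and what gradient accumulation does to the effective batch size. The layer dimensions and the number of accumulation steps are hypothetical placeholders, since neither is stated above.

```python
LORA_RANK = 16
BATCH_SIZE = 1


def lora_params(d_in: int, d_out: int, rank: int = LORA_RANK) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out linear layer.

    The A matrix has shape (rank, d_in) and the B matrix has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def effective_batch_size(per_device: int, grad_accum_steps: int) -> int:
    """Samples contributing to each optimizer step on a single device."""
    return per_device * grad_accum_steps


# Hypothetical layer size and accumulation steps, for illustration only.
example_added = lora_params(d_in=1024, d_out=1024)
example_full = 1024 * 1024
example_batch = effective_batch_size(BATCH_SIZE, grad_accum_steps=4)

print(example_added, example_added / example_full, example_batch)
```

For this illustrative layer the adapter trains about 3% as many parameters as the full weight matrix, which is the main reason LoRA fine-tuning fits on modest hardware.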
Dataset
The adapters were trained on the Afri-Aya dataset, which includes:
- 2,466 images from 13 African cultures
- Bilingual captions (English + local languages)
- Cultural Q&A pairs for each image
- 13 categories: Food, Festivals, Notable Figures, Music, etc.
Languages Covered
Luganda, Kinyarwanda, Arabic, Twi, Hausa, Nyankore, Yoruba, Kirundi, Zulu, Swahili, Gishu, Krio, Igbo
Example Usage
from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image
# Load model with LoRA adapters
model, processor = FastVisionModel.from_pretrained(
    "Bronsn/afri-aya-gemma-3-4b-vision-lora",
    load_in_4bit=True,
)
FastVisionModel.for_inference(model)
# Prepare input
image = Image.open("african_cultural_image.jpg")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What cultural significance does this image have?"},
            {"type": "image"},
        ],
    }
]
# Generate response
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to("cuda")
text_streamer = TextStreamer(processor.tokenizer, skip_prompt=True)
result = model.generate(**inputs, streamer=text_streamer, max_new_tokens=128)
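When asking several questions about the same image, it can help to wrap the message construction in a small helper. The sketch below is a hypothetical convenience function (`build_vqa_messages` is not part of Unsloth or Transformers), and the optional language instruction is an assumed prompting pattern rather than a documented feature of this model.

```python
from typing import Optional


def build_vqa_messages(question: str, answer_language: Optional[str] = None) -> list:
    """Build a single-turn chat message with a text question and one image slot."""
    text = question.strip()
    if answer_language:
        # Assumed prompting pattern: ask for the answer in a given language.
        text = f"{text} Please answer in {answer_language}."
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": text},
                {"type": "image"},
            ],
        }
    ]


messages = build_vqa_messages("What dish is shown in this image?", answer_language="Swahili")
print(messages[0]["content"][0]["text"])
```

The returned list has the same shape as the `messages` variable in the example, so it can be passed to `processor.apply_chat_template(messages, add_generation_prompt=True)` unchanged.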
Model Performance
This model has been fine-tuned specifically for:
- African cultural image understanding
- Multilingual visual question answering
- Cultural context recognition
- Traditional and modern African life scenarios
Citation
@misc{afri_aya_lora_2024,
title={Afri-Aya Gemma 3 4B Vision LoRA: African Cultural VQA Adapters},
author={Cohere Labs Regional Africa Community},
year={2024},
publisher={HuggingFace},
url={https://huggingface.co/Bronsn/afri-aya-gemma-3-4b-vision-lora}
}
License
Apache 2.0
Acknowledgments
- Dataset: Afri-Aya dataset by Cohere Labs Regional Africa Community
- Base Model: Gemma 3 4B by Google
- Training Framework: Unsloth for efficient LoRA fine-tuning
- Community: Expedition Aya challenge participants
LoRA adapters created with ❤️ for African culture preservation and education