Afri-Aya Gemma 3 4B Vision - LoRA Adapters 🌍

This repository contains the LoRA (Low-Rank Adaptation) adapters for the Afri-Aya Gemma 3 4B Vision model, fine-tuned on the Afri-Aya dataset for African cultural visual question answering.

Model Details

  • Base Model: unsloth/gemma-3-4b-pt
  • Training Dataset: CohereLabsCommunity/afri-aya (2,466 images, 13 African languages)
  • Fine-tuning Method: LoRA with Unsloth
  • Languages Supported: English + 13 African languages
  • LoRA Rank: 16
  • Training Framework: Unsloth + TRL

Repository Contents

This repository contains the LoRA adapter weights that can be applied to the base Gemma 3 4B model:

  • adapter_config.json - LoRA configuration
  • adapter_model.safetensors - LoRA adapter weights
  • README.md - This documentation
  • Other supporting files for the LoRA adapters
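To inspect the adapter configuration (rank, alpha, target modules) without downloading the base model, PEFT can read adapter_config.json directly. A minimal sketch, assuming a standard PEFT install:

from peft import PeftConfig

# Fetches and parses adapter_config.json from the Hub
config = PeftConfig.from_pretrained("Bronsn/afri-aya-gemma-3-4b-vision-lora")
print(config.base_model_name_or_path)  # expected: unsloth/gemma-3-4b-pt
print(config.r, config.lora_alpha)     # LoRA rank and alpha (16 and 32 here)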

Usage

Option 1: Load LoRA Adapters with Unsloth

from unsloth import FastVisionModel

# Load base model with LoRA adapters
model, processor = FastVisionModel.from_pretrained(
    model_name="Bronsn/afri-aya-gemma-3-4b-vision-lora",
    load_in_4bit=True,
)

# Enable inference mode
FastVisionModel.for_inference(model)

Option 2: Use with PEFT/Transformers

import torch
from transformers import AutoModelForVision2Seq, AutoProcessor
from peft import PeftModel

# Load base model
base_model = AutoModelForVision2Seq.from_pretrained(
    "unsloth/gemma-3-4b-pt",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "Bronsn/afri-aya-gemma-3-4b-vision-lora")
processor = AutoProcessor.from_pretrained("unsloth/gemma-3-4b-pt")

Option 3: Merge and Use

For production use, you might want to merge the adapters with the base model:

from unsloth import FastVisionModel

# Load with LoRA in full precision so the merged weights are exact
# (merging on top of a 4-bit load would bake in quantization error)
model, processor = FastVisionModel.from_pretrained(
    model_name="Bronsn/afri-aya-gemma-3-4b-vision-lora",
    load_in_4bit=False,
)

# Merge the adapters into the base weights, then save (PEFT API)
model = model.merge_and_unload()
model.save_pretrained("merged_model")
processor.save_pretrained("merged_model")

Merged Model

For convenience, we also provide a merged version of this model at: Bronsn/afri-aya-gemma-3-4b-vision

The merged model is ready to use without requiring LoRA loading.
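Loading the merged model needs no PEFT step at all; a minimal sketch mirroring the Option 2 loading code, just pointed at the merged repository:

import torch
from transformers import AutoModelForVision2Seq, AutoProcessor

# The merged checkpoint loads like any standalone vision-language model
model = AutoModelForVision2Seq.from_pretrained(
    "Bronsn/afri-aya-gemma-3-4b-vision",
    torch_dtype=torch.float16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("Bronsn/afri-aya-gemma-3-4b-vision")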

Training Details

  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Target Modules: Vision and text projection layers
  • Learning Rate: 2e-4
  • Batch Size: 1 (with gradient accumulation)
  • Epochs: 1
  • Training Framework: Unsloth for efficient fine-tuning
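As a rough illustration, the hyperparameters above map onto an Unsloth LoRA setup along these lines. This is a hedged sketch, not the exact training script; the dropout, bias, and random-seed values are assumptions:

from unsloth import FastVisionModel

model, processor = FastVisionModel.from_pretrained(
    "unsloth/gemma-3-4b-pt",
    load_in_4bit=True,
)

# Attach LoRA adapters; r and lora_alpha match the values listed above,
# the remaining arguments are illustrative defaults
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,
    finetune_language_layers=True,
    r=16,
    lora_alpha=32,
    lora_dropout=0,     # assumption
    bias="none",        # assumption
    random_state=3407,  # assumption
)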

Dataset

Trained on the Afri-Aya dataset, which includes:

  • 2,466 images from 13 African cultures
  • Bilingual captions (English + local languages)
  • Cultural Q&A pairs for each image
  • 13 categories: Food, Festivals, Notable Figures, Music, etc.
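The dataset can be pulled from the Hub with the datasets library. The split name below is an assumption; check the dataset card for the exact schema:

from datasets import load_dataset

# Load the Afri-Aya dataset (split name assumed; see the dataset card)
ds = load_dataset("CohereLabsCommunity/afri-aya", split="train")
print(ds)            # dataset size and column names
print(ds[0].keys())  # schema of a single example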

Languages Covered

Luganda, Kinyarwanda, Arabic, Twi, Hausa, Nyankore, Yoruba, Kirundi, Zulu, Swahili, Gishu, Krio, Igbo

Example Usage

from unsloth import FastVisionModel
from transformers import TextStreamer
from PIL import Image

# Load model with LoRA adapters
model, processor = FastVisionModel.from_pretrained(
    "Bronsn/afri-aya-gemma-3-4b-vision-lora",
    load_in_4bit=True,
)
FastVisionModel.for_inference(model)

# Prepare input
image = Image.open("african_cultural_image.jpg")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What cultural significance does this image have?"},
            {"type": "image"},
        ],
    }
]

# Generate response
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to("cuda")

text_streamer = TextStreamer(processor.tokenizer, skip_prompt=True)
result = model.generate(**inputs, streamer=text_streamer, max_new_tokens=128)
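The streamer prints tokens as they arrive; to also keep the full response as a string, decode the returned ids while slicing off the prompt portion:

# Decode only the newly generated tokens (drop the prompt ids)
generated = result[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(generated, skip_special_tokens=True)[0])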

Model Performance

This model has been fine-tuned specifically for:

  • African cultural image understanding
  • Multilingual visual question answering
  • Cultural context recognition
  • Traditional and modern African life scenarios

Citation

@misc{afri_aya_lora_2024,
  title={Afri-Aya Gemma 3 4B Vision LoRA: African Cultural VQA Adapters},
  author={Cohere Labs Regional Africa Community},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/Bronsn/afri-aya-gemma-3-4b-vision-lora}
}

License

Apache 2.0

Acknowledgments

  • Dataset: Afri-Aya dataset by Cohere Labs Regional Africa Community
  • Base Model: Gemma 3 4B by Google
  • Training Framework: Unsloth for efficient LoRA fine-tuning
  • Community: Expedition Aya challenge participants

LoRA adapters created with ❤️ for African culture preservation and education
