---
library_name: transformers
tags:
- vision
- semantic-segmentation
- segformer
- sky
- sea
- obstacle
- huggingface
license: apache-2.0
datasets:
- Wilbur1240/MaSTr1325_512x384
base_model:
- nvidia/segformer-b0-finetuned-ade-512-512
---

# SegFormer Fine-Tuned on Custom Sky/Sea/Obstacle Dataset

This model is a fine-tuned version of `nvidia/segformer-b0-finetuned-ade-512-512` on a custom dataset with 3 semantic classes:

- **Sky**
- **Sea**
- **Obstacle**

It is intended for vision-based autonomous surface navigation and maritime scene understanding.

---

## Model Details

### Model Description

- **Base architecture:** SegFormer-B0
- **Pretrained on:** ADE20K dataset
- **Fine-tuned for:** Semantic segmentation on maritime images
- **Number of classes:** 3
- **Ignore index:** 255
- **Resolution:** 512×512 input images
- **Training precision:** fp32
- **Framework:** PyTorch with 🤗 Transformers

---

### Model Sources

- **Base model:** [`nvidia/segformer-b0-finetuned-ade-512-512`](https://huggingface.co/nvidia/segformer-b0-finetuned-ade-512-512)
- **Codebase:** Uses Hugging Face Transformers + Datasets

---

## Usage

```python
from transformers import AutoModelForSemanticSegmentation, AutoImageProcessor
from PIL import Image
import torch

# Load model and processor
model = AutoModelForSemanticSegmentation.from_pretrained("Wilbur1240/segformer-b0-finetuned-ade-512-512-finetune-mastr1325")
processor = AutoImageProcessor.from_pretrained("Wilbur1240/segformer-b0-finetuned-ade-512-512-finetune-mastr1325")

# Load and preprocess an image
image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Inference
with torch.no_grad():
    outputs = model(**inputs)

# SegFormer logits are at 1/4 of the processed input resolution
logits = outputs.logits  # [1, num_classes, H/4, W/4]
pred_seg = logits.argmax(dim=1)  # [1, H/4, W/4]
```
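
SegFormer produces logits at a quarter of the processed input resolution, so the predicted mask usually has to be resized before it can be overlaid on the original image. The following is a minimal sketch of that step, not part of the original card: the helper name `upsample_and_argmax` is hypothetical, and random logits stand in for `outputs.logits` so the snippet runs without downloading the model.

```python
import torch
import torch.nn.functional as F


def upsample_and_argmax(logits: torch.Tensor, target_size: tuple) -> torch.Tensor:
    """Resize logits to target_size and return per-pixel class ids.

    logits:      float tensor of shape [B, num_classes, h, w]
    target_size: (height, width) of the desired output mask
    returns:     int64 tensor of shape [B, height, width]
    """
    # Interpolate the logits (not the argmax result) so class boundaries stay smooth
    upsampled = F.interpolate(
        logits, size=target_size, mode="bilinear", align_corners=False
    )
    return upsampled.argmax(dim=1)


# Stand-in for `outputs.logits`: 3 classes at 128x128 (a 512x512 input divided by 4)
dummy_logits = torch.randn(1, 3, 128, 128)

# Hypothetical original image size, given as (height, width)
mask = upsample_and_argmax(dummy_logits, target_size=(384, 512))
print(mask.shape)  # torch.Size([1, 384, 512])
```

With a real image, `target_size` would be `image.size[::-1]`, since PIL reports size as `(width, height)`. The image processor's `post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])` method should give an equivalent result. The mapping from class ids to Sky/Sea/Obstacle is best read from `model.config.id2label` rather than assumed.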