SaSaSa2VA Model Zoo
Models and challenge report for Segmentation Augmented and Selective Averaged Sa2VA (SaSaSa2VA).
[arXiv] [🧑‍💻 GitHub] [🤗 HuggingFace] [🎯 Challenge]
Quanzhu Niu1* · Dengxian Gong1* · Shihao Chen1* · Tao Zhang1* · Yikang Zhou1 · Haobo Yuan2 · Lu Qi1 · Xiangtai Li3 · Shunping Ji1†
1WHU    2UC Merced    3NTU
*equal contribution    †corresponding author
We won 1st place in the RVOS (Referring Video Object Segmentation) track of the ICCV 2025 LSVOS (Large-scale Video Object Segmentation) challenge. The methods of the top 3 teams are all based on Sa2VA. The challenge leaderboard:
Method/Team Name | J&F | Report |
---|---|---|
🏆 SaSaSa2VA (Ours) | 67.45 | link |
🥈 Ranhong | 64.65 | link |
🥉 Sa2VA-i | 64.14 | link |
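
The J&F column is not defined on this page. In DAVIS-style video segmentation benchmarks it usually denotes the mean of region similarity J (mask intersection-over-union) and boundary accuracy F (a contour F-measure). The minimal sketch below assumes that definition; the function names `jaccard` and `j_and_f` are illustrative and are not taken from the challenge toolkit. It computes J for one pair of binary masks and averages it with an F value supplied by the caller, since the boundary F-measure itself is not implemented here.

```python
def jaccard(pred, gt):
    """Region similarity J: intersection-over-union of two binary masks.

    Masks are given as equally sized nested lists of 0/1 values.
    """
    inter = 0
    union = 0
    for row_p, row_g in zip(pred, gt):
        for p, g in zip(row_p, row_g):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j_mean, f_mean):
    """J&F under the assumed definition: arithmetic mean of J and F."""
    return (j_mean + f_mean) / 2.0


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered by either mask.
pred = [[0, 1, 1],
        [0, 1, 0]]
gt = [[0, 1, 0],
      [0, 1, 1]]

j = jaccard(pred, gt)        # 2 / 4 = 0.5
score = j_and_f(j, 0.7)      # 0.7 is a placeholder F value, not a real result
```

In a full evaluation, J and F would typically be averaged over all frames and objects before being combined. Leaderboard values such as 67.45 would then be this mean expressed as a percentage.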
We provide the following models:
Model Name | Base MLLM | HF Link |
---|---|---|
SaSaSa2VA-4B | InternVL2.5-4B | 🤗 link |
SaSaSa2VA-14B | InternVL3.5-14B | To be released |
SaSaSa2VA-26B | InternVL2.5-26B | 🤗 link |
If you find our work useful, please consider citing the challenge report:
@article{sasasa2va,
title={The 1st Solution for 7th LSVOS RVOS Track: {SaSaSa2VA}},
author={Niu, Quanzhu and Gong, Dengxian and Chen, Shihao and Zhang, Tao and Zhou, Yikang and Yuan, Haobo and Qi, Lu and Li, Xiangtai and Ji, Shunping},
journal={arXiv preprint arXiv:2509.16972},
year={2025}
}