---
license: apache-2.0
base_model:
- OpenGVLab/InternVL2_5-8B
pipeline_tag: mask-generation
---
# HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
[\[📂 GitHub\]](https://github.com/yayafengzi/LMM-HiMTok)
[\[📜 Paper\]](https://arxiv.org/abs/2503.13026)
This is the InternVL2_5-HiMTok-8B model, fine-tuned on the training sets of the RefCOCO series of datasets.
If you find this project useful in your research, please consider citing:
```BibTeX
@article{wang2025himtok,
title={HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model},
author={Wang, Tao and Cheng, Changxu and Wang, Lingfeng and Chen, Senda and Zhao, Wuyue},
journal={arXiv preprint arXiv:2503.13026},
year={2025}
}
```