---
license: apache-2.0
base_model:
- OpenGVLab/InternVL2_5-8B
pipeline_tag: mask-generation
---
# HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model

[\[📂 GitHub\]](https://github.com/yayafengzi/LMM-HiMTok) [\[📜 Paper\]](https://arxiv.org/abs/2503.13026)


This is the InternVL2_5-HiMTok-8B model, fine-tuned on the training sets of the RefCOCO series.

If you find this project useful in your research, please consider citing:

```BibTeX
@article{wang2025himtok,
  title={HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model},
  author={Wang, Tao and Cheng, Changxu and Wang, Lingfeng and Chen, Senda and Zhao, Wuyue},
  journal={arXiv preprint arXiv:2503.13026},
  year={2025}
}
```