Improve model card: Update pipeline tag, add library name, and enrich content for Video-MTR
#1 by nielsr (HF Staff)
This PR significantly improves the model card for the Video-MTR model by:
- Updating the `pipeline_tag`: Changed from `visual-question-answering` to `video-text-to-text`. This new tag more accurately reflects the model's capabilities in long video understanding and multi-turn reasoning for question comprehension, enhancing its discoverability on the Hub (https://huggingface.co/models?pipeline_tag=video-text-to-text).
- Adding `library_name: transformers`: Evidence from `config.json` (e.g., `architectures: ["Qwen2_5_VLForConditionalGeneration"]`, `transformers_version: "4.49.0"`) confirms compatibility with the Hugging Face `transformers` library, enabling automated "how to use" code snippets for users.
- Expanding the model card content:
  - Added the full paper title as the main heading.
  - Included the complete paper abstract to provide detailed insights into the model's methodology and contributions.
  - The paper link in the content remains https://arxiv.org/abs/2508.20478 as per instructions.
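As a rough illustration of the metadata change described above, the following Python sketch derives the two fields from the `config.json` values quoted in this PR and renders them as YAML front matter. The helper names `infer_library_name` and `render_front_matter` are hypothetical and not part of any library, the "has a `transformers_version` key" check is only an assumed heuristic, and the actual model card may carry additional metadata fields that are not shown here.

```python
import json
from typing import Optional


def infer_library_name(config: dict) -> Optional[str]:
    """Hypothetical heuristic: treat a config that records a
    transformers_version as evidence of transformers compatibility."""
    if "transformers_version" in config:
        return "transformers"
    return None


def render_front_matter(metadata: dict) -> str:
    """Render a flat dict of string values as a YAML front matter block."""
    lines = ["---"]
    for key, value in metadata.items():
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


# The two config.json entries quoted in the PR description.
config = json.loads(
    '{"architectures": ["Qwen2_5_VLForConditionalGeneration"],'
    ' "transformers_version": "4.49.0"}'
)

# New pipeline tag from this PR; library_name is added only if inferred.
metadata = {"pipeline_tag": "video-text-to-text"}
library = infer_library_name(config)
if library is not None:
    metadata["library_name"] = library

front_matter = render_front_matter(metadata)
print(front_matter)
```

The rendered block is what would sit at the top of the model card's `README.md`, ahead of the paper title and abstract.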
Please note that the provided GitHub repository and project page URLs point to a different project ("UniMuMo") and have therefore been omitted from this model card to maintain accuracy. No sample usage is included because no relevant code snippets were found.