Improve model card: Update pipeline tag, add library name, and enrich content for Video-MTR

#1 opened by nielsr (HF Staff)

This PR significantly improves the model card for the Video-MTR model by:

  • Updating the pipeline_tag: Changed from visual-question-answering to video-text-to-text. This tag more accurately reflects the model's focus on long video understanding through multi-turn reasoning over the question, and improves its discoverability on the Hub (https://huggingface.co/models?pipeline_tag=video-text-to-text).
  • Adding library_name: transformers: Evidence from config.json (e.g., architectures: ["Qwen2_5_VLForConditionalGeneration"], transformers_version: "4.49.0") confirms compatibility with the Hugging Face transformers library, enabling automated "how to use" code snippets for users (a rough sketch of such a snippet is shown after this list).
  • Expanding the model card content:
    • Added the full paper title as the main heading.
    • Included the complete paper abstract to provide detailed insights into the model's methodology and contributions.
    • The paper link in the content remains https://arxiv.org/abs/2508.20478 as per instructions.
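For context on the library_name change, here is a minimal sketch of what a transformers "how to use" snippet could look like for this checkpoint. It follows the standard Qwen2.5-VL usage pattern (including the optional qwen-vl-utils helper); the repo id, video path, question, and fps value below are placeholders, and the actual prompting and video preprocessing for Video-MTR may differ.

```python
# Minimal sketch of a transformers-based usage snippet (not taken from the model card).
# The repo id below is a placeholder; this only illustrates the standard Qwen2.5-VL
# loading pattern implied by the architectures field in config.json.
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from qwen_vl_utils import process_vision_info  # pip install qwen-vl-utils

model_id = "ORG/Video-MTR"  # placeholder repo id

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# One user turn containing a long video and a question about it.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "video", "video": "file:///path/to/long_video.mp4", "fps": 1.0},
            {"type": "text", "text": "What does the person do after entering the kitchen?"},
        ],
    }
]

# Build the chat prompt and extract the video frames expected by the processor.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text], images=image_inputs, videos=video_inputs,
    padding=True, return_tensors="pt"
).to(model.device)

# Generate the answer and strip the prompt tokens from the output.
generated = model.generate(**inputs, max_new_tokens=128)
trimmed = generated[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```

Once library_name: transformers is set, the Hub's automatically generated snippet for this model should resemble the sketch above.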

Please note that the provided GitHub repository and project page URLs were found to belong to a different project ("UniMuMo") and have therefore been omitted from this model card to maintain accuracy. No sample usage was added to the model card itself, as no relevant code snippets were found.

