GRAFT: GRaPH and Table Reasoning for Textual Alignment -- A Benchmark for Structured Instruction Following and Visual Reasoning Paper • 2508.15690 • Published Aug 21 • 8
view article Article SyGra: The One-Stop Framework for Building Data for LLMs and SLMs By ServiceNow-AI and 3 others • 15 days ago • 11
AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs Paper • 2509.08031 • Published 27 days ago • 21
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published Feb 3 • 38
M-RewardBench: Evaluating Reward Models in Multilingual Settings Paper • 2410.15522 • Published Oct 20, 2024 • 12