Chatbot Arena Leaderboard
Display chatbot performance leaderboard
A collection of Leaderboards for LLMs ⚡️⚖️ 🤗
Track, rank and evaluate open LLMs and chatbots
Run a Streamlit web app
View and submit LLM evaluations
Explore hardware performance for LLMs
View and submit machine learning model evaluations
Display and explore model leaderboards and chat history
Embedding Leaderboard
Track, rank and evaluate open LLMs' CoT quality
View LLM performance rankings
Explore and analyze code evaluation data
Display and analyze PyTorch Image Models leaderboard
Evaluating LLMs on Multilingual Multimodal Financial Tasks
VLMEvalKit Eval Results in video understanding benchmark
A leaderboard for multimodal models
Compare Open LLM Leaderboard results
Display document retrieval leaderboard data
Vote on AI responses to rank models
VLMEvalKit Evaluation Results Collection
Interact with multiple chatbots simultaneously
Official Leaderboard for OmniEval
Submit and evaluate models on GAIA benchmark
Blind vote on HF TTS models!
Display text-to-text translation interface
Realtime Image/Video Gen AI Arena
Ranking of LLMs for agentic tasks
Request evaluation for a speech model
A Leaderboard that demonstrates LMM reasoning capabilities
A leaderboard for LLMs powering smolagents
Submit and score model predictions for video and text tasks
KVPress leaderboard: benchmark KV Cache compression methods
LLM Robustness leaderboard