Agent Leaderboard
Ranking of LLMs for agentic tasks
The fastest way to ship reliable AI apps. Galileo brings automation and insight to AI evaluations so you can ship with confidence.
Welcome to the official Hugging Face Space for Galileo, bringing automation and insight to AI evaluations so you can ship with confidence.
Galileo helps developers move faster and ship more reliable AI by providing a full suite of evaluation and debugging tools purpose-built for modern AI workflows, from fine-tuning to RAG pipelines, agents, and more.
This Space hosts static content and community-facing artifacts that align with our mission:
Evaluation Reports & Visualizations
Insights generated using Galileo's evaluations and reliability tools.
Developer Cookbooks & Tutorials
Lightweight walkthroughs for LLM app builders, with reliability and evals built in.
Open Source Contributions
Tools, benchmarks, and reference projects that support the open ML ecosystem.
This README is built with ♥ by the Galileo.ai Developer Relations team.