Commit History
Upload from GitHub Actions: Fix vibecoding
75010c2
verified
Upload from GitHub Actions: Ugly fix for CI errors
adc94d7
verified
Upload from GitHub Actions: Try moving `cache` calls that cause CI issues
bc4afa0
verified
Upload from GitHub Actions: Exclude free models from evals
c9e9db6
verified
Upload from GitHub Actions: Display N/A scores as such
1e8952a
verified
Block gemini-2.5-pro-exp-03-25
092c06a
David Pomerenke
Pass through kwargs
5fa433f
David Pomerenke
Fix dataset loading
c990cb9
David Pomerenke
Temporarily disable classification task
a48ff53
David Pomerenke
Fix path and dev group declaration
1614427
David Pomerenke
Fix import paths
c567aee
David Pomerenke
added download function and edited INFO
f529b7b
Use most popular current + historical models
9983b5f
David Pomerenke
Only run tasks for which there is no result yet
2f9dee1
David Pomerenke
Run on 40 languages, additional models
260c1a3
David Pomerenke
Shorter classification prompt + error handling
0384b92
David Pomerenke
Move functions for sharing them
55406ba
David Pomerenke
Fix response when no evals data is available
32d50b0
David Pomerenke
Fix: don't cache model metadata forever
c29b8da
David Pomerenke
Run on 15 languages
f8a3dad
David Pomerenke
Update models
8941a67
David Pomerenke
Implement MMLU task
a683732
David Pomerenke
MMLU data loader for 3 parallel datasets
47170a5
David Pomerenke
Analyze MMLU datasets
031925d
David Pomerenke
Add Global MMLU benchmark
ce2acb0
David Pomerenke
Translation both from and to
731eddd
David Pomerenke
Get popular models from OpenRouter
a32a92f
David Pomerenke
Add OpenRouter metadata to models
9002fc2
David Pomerenke
Run on 100 languages, adjust display
8274634
David Pomerenke
Add Dockerfile
4d13673
David Pomerenke
Fix world map and apply filters for it
92d8154
David Pomerenke
Fix and refactor backend filtering
eb1696c
David Pomerenke
Speed things up
566c57e
David Pomerenke
Language selection checkboxes & filtering in backend
d91b022
David Pomerenke
Basic backend setup with FastAPI but without actual filtering
2c21cf7
David Pomerenke
Add OpenGPT-X
43057f8
David Pomerenke
spBLEU tokenizer, run on more languages
eaf2d97
David Pomerenke
Better map tooltip
92b2164
David Pomerenke
Process data for country map
723f963
David Pomerenke
Autonyms and cooler dataset search display
33469f2
David Pomerenke
Nicer layout for datasets table and other tables
430bde6
David Pomerenke
Datasets table
11c32ae
David Pomerenke
More models
c5278dd
David Pomerenke
Basic language table
d1a7111
David Pomerenke
Nicer model table with type and size and filters and colourful score bars
9dbdcb2
David Pomerenke
Params and license metadata from HF API
3ed02d5
David Pomerenke
Refactor eval code into files
da6e1bc
David Pomerenke