Commit History

Use most popular current + historical models
9983b5f

David Pomerenke committed on

Only run tasks for which there is no result yet
2f9dee1

David Pomerenke committed on

Run on 40 languages, additional models
260c1a3

David Pomerenke committed on

Shorter classification prompt + error handling
0384b92

David Pomerenke committed on

Move functions for sharing them
55406ba

David Pomerenke committed on

Fix response when no evals data is available
32d50b0

David Pomerenke committed on

Fix: don't cache model metadata forever
c29b8da

David Pomerenke committed on

Run on 15 languages
f8a3dad

David Pomerenke committed on

Update models
8941a67

David Pomerenke committed on

Implement MMLU task
a683732

David Pomerenke committed on

MMLU data loader for 3 parallel datasets
47170a5

David Pomerenke committed on

Analyze MMLU datasets
031925d

David Pomerenke committed on

Add Global MMLU benchmark
ce2acb0

David Pomerenke committed on

Translation both from and to
731eddd

David Pomerenke committed on

Get popular models from OpenRouter
a32a92f

David Pomerenke committed on

Add OpenRouter metadata to models
9002fc2

David Pomerenke committed on

Run on 100 languages, adjust display
8274634

David Pomerenke committed on

Add Dockerfile
4d13673

David Pomerenke committed on

Fix world map and apply filters for it
92d8154

David Pomerenke committed on

Fix and refactor backend filtering
eb1696c

David Pomerenke committed on

Speed things up
566c57e

David Pomerenke committed on

Language selection checkboxes & filtering in backend
d91b022

David Pomerenke committed on

Basic backend setup with FastAPI but without actual filtering
2c21cf7

David Pomerenke committed on

Add OpenGPT-X
43057f8

David Pomerenke committed on

spBLEU tokenizer, run on more languages
eaf2d97

David Pomerenke committed on

Better map tooltip
92b2164

David Pomerenke committed on

Process data for country map
723f963

David Pomerenke committed on

Autonyms and cooler dataset search display
33469f2

David Pomerenke committed on

Nicer layout for datasets table and other tables
430bde6

David Pomerenke committed on

Datasets table
11c32ae

David Pomerenke committed on

More models
c5278dd

David Pomerenke committed on

Basic language table
d1a7111

David Pomerenke committed on

Nicer model table with type and size and filters and colourful score bars
9dbdcb2

David Pomerenke committed on

Params and license metadata from HF API
3ed02d5

David Pomerenke committed on

Refactor eval code into files
da6e1bc

David Pomerenke committed on