Spaces: fair-forward / evals-for-every-language

evals-for-every-language / evals

5 contributors

History: 69 commits

Latest commit (davidpomerenke): Upload from GitHub Actions: Exclude TruthfulQA from proficiency score (3fbff09, verified, 3 days ago)

Name              Size       Updated             Last commit
datasets_         -          3 days ago          Upload from GitHub Actions: TruthfulQA translation WIP
__init__.py       1 Bytes    4 months ago        Refactor eval code into files
backend.py        5.11 kB    3 days ago          Upload from GitHub Actions: Exclude TruthfulQA from proficiency score
countries.py      1.42 kB    3 months ago        Add Dockerfile
download_data.py  8.44 kB    about 2 months ago  Upload from GitHub Actions: Use FLORES+ via Huggingface
languages.py      2.08 kB    about 2 months ago  Upload from GitHub Actions: More results
main.py           2.5 kB     5 days ago          Upload from GitHub Actions: Get more results, compute average based on all tasks
models.py         9.5 kB     5 days ago          Upload from GitHub Actions: Get more results, compute average based on all tasks
plots.py          4.8 kB     3 days ago          Upload from GitHub Actions: TruthfulQA translation WIP
tasks.py          14.4 kB    5 days ago          Upload from GitHub Actions: Get more results, compute average based on all tasks
translate.py      272 Bytes  7 days ago          Upload from GitHub Actions: Translate MMLU and evaluate