HachiML committed on
Commit 7d2571d · 1 Parent(s): 4500e0d

Update README.md

Files changed (1)
  1. README.md +9 -8
README.md CHANGED
@@ -7,17 +7,18 @@ language:
  - ja
  ---
  ## JGLUE Score
- We evaluated our model using the following JGLUE tasks. Here are the scores:
- | Task | Score |
- |---------------------|----------:|
- | JCOMMONSENSEQA(acc) | 75.78 |
- | JNLI(acc) | 50.69 |
- | MARC_JA(acc) | 79.64 |
- | JSQUAD(exact_match) | 62.83 |
- | **Average** | **67.23** |
+ I evaluated this model using the following JGLUE tasks. Here are the scores:
+ | Task | Llama-2-7b-hf (*) | This Model |
+ |---------------------|:-----------------:|:----------:|
+ | JCOMMONSENSEQA(acc) | 51.56 | 75.78 |
+ | JNLI(acc) | 29.74 | 50.69 |
+ | MARC_JA(acc) | 85.72 | 79.64 |
+ | JSQUAD(exact_match) | 64.16 | 62.83 |
+ | **Average** | **57.79** | **67.23** |
  - Note: Use v0.3 prompt template
  - The JGLUE scores were measured using the following script:
  [Stability-AI/lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/jp-stable)
+ - (*) Refer to the following article: [Google Colab での JP Language Model Evaluation Harness による日本語LLMの評価手順](https://note.com/npaka/n/nedf4dacd4037)

  ## How to use

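The **Average** row in the table appears to be the unweighted mean of the four task scores. The README does not say how it was computed, so that is an assumption. A minimal Python sketch to check the arithmetic, using only the numbers shown in the diff (the variable names are illustrative):

```python
from statistics import mean

# Per-task scores copied from the table in the diff, in the order
# JCOMMONSENSEQA, JNLI, MARC_JA, JSQUAD.
scores = {
    "Llama-2-7b-hf": [51.56, 29.74, 85.72, 64.16],
    "This Model": [75.78, 50.69, 79.64, 62.83],
}

# Values from the "Average" row of the same table.
reported = {"Llama-2-7b-hf": 57.79, "This Model": 67.23}

# Assumption: the average is a plain arithmetic mean over the four tasks.
averages = {name: mean(vals) for name, vals in scores.items()}

for name, avg in averages.items():
    diff = abs(avg - reported[name])
    print(f"{name}: computed mean {avg:.3f}, reported {reported[name]}, |diff| {diff:.3f}")
```

Both computed means agree with the reported averages to within 0.01. The small remainder is consistent with the table values having been truncated, or averaged from unrounded scores.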