matsuo-lab committed on
Commit
c5e01d8
1 Parent(s): a1205ba

Update README.md

Files changed (1)
  1. README.md +6 -3
README.md CHANGED
@@ -39,12 +39,15 @@ This repository provides a Japanese-centric multilingual GPT-NeoX model of 10 bi
 
  * **Japanese benchmark**
 
- - *The 4-task average accuracy is based on results of JCommonsenseQA, JNLI, MARC-ja, and JSQuAD.*
+ - *We used [Stability-AI/lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/2f1583c0735eacdfdfa5b7d656074b69577b6774) library for evaluation.*
+ - *The 4-task average accuracy is based on results of JCommonsenseQA-1.1, JNLI-1.1, MARC-ja-1.1, and JSQuAD-1.1.*
+ - *model loading is performed with float16, and evaluation is performed with template version 0.3 using the few-shot in-context learning.*
+ - *The number of few-shots is 3,3,3,2.*
 
  | Model | Average | JCommonsenseQA | JNLI | MARC-ja | JSQuAD |
  | :-- | :-- | :-- | :-- | :-- | :-- |
- | weblab-10b-instruction-sft | 79.04 | 74.35 | 65.65 | 96.06 | 80.09 |
- | weblab-10b | 67.27 | 65.86 | 54.19 | 84.49 | 64.54 |
+ | weblab-10b-instruction-sft | 78.78 | 74.35 | 65.65 | 96.06 | 79.04 |
+ | weblab-10b | 66.38 | 65.86 | 54.19 | 84.49 | 60.98 |
 
  ---
 
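The updated "Average" column can be checked against the per-task scores. The sketch below assumes the average is an unweighted mean of the four task scores, rounded half-up to two decimals; the README does not state the rounding rule, so that part is an assumption, and the names `SCORES` and `four_task_average` are hypothetical helpers introduced only for this illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Task order used in the table: JCommonsenseQA, JNLI, MARC-ja, JSQuAD.
# Scores are copied from the "+" rows of the diff and kept as strings
# so that Decimal parses them exactly.
SCORES = {
    "weblab-10b-instruction-sft": ("74.35", "65.65", "96.06", "79.04"),
    "weblab-10b": ("65.86", "54.19", "84.49", "60.98"),
}


def four_task_average(task_scores):
    """Unweighted mean of the task scores, rounded to two decimals.

    Half-up rounding is an assumption, not something the README states.
    """
    values = [Decimal(s) for s in task_scores]
    mean = sum(values) / Decimal(len(values))
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for model, task_scores in SCORES.items():
    print(model, four_task_average(task_scores))
# prints:
# weblab-10b-instruction-sft 78.78
# weblab-10b 66.38
```

Both results match the "Average" values in the added table rows, so the only per-task numbers that changed in this commit are the JSQuAD scores, with the averages moving accordingly.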