CarrotAI committed on
Commit f9d57c1 • 1 Parent(s): 1010758

Update README.md

  1. README.md +32 -32
README.md CHANGED
@@ -29,27 +29,6 @@ The following models were included in the merge:
 
  ### Score
  ```
- // use llm-ko-eval
- "scores": {
- "AVG_llm_kr_eval": "0.4425",
- "EL": "0.0522",
- "FA": "0.0865",
- "NLI": "0.6700",
- "QA": "0.5100",
- "RC": "0.8937",
- "klue_ner_set_f1": "0.0944",
- "klue_re_exact_match": "0.0100",
- "kmmlu_preview_exact_match": "0.4000",
- "kobest_copa_exact_match": "0.8200",
- "kobest_hs_exact_match": "0.5500",
- "kobest_sn_exact_match": "0.9800",
- "kobest_wic_exact_match": "0.6200",
- "korea_cg_bleu": "0.0865",
- "kornli_exact_match": "0.6400",
- "korsts_pearson": "0.8547",
- "korsts_spearman": "0.8464"
- }
-
  openai/gpt-4 : 0.6158
  gemini-pro: 0.515
  OpenCarrot-Mix-7B (this) : 0.4425
@@ -57,19 +36,40 @@ mistralai/Mixtral-8x7B-Instruct-v0.1 : 0.4304
  openai/gpt-3.5-turbo : 0.4217
  ```
 
+ | Metric | Score |
+ |--------------|---------|
+ | AVG_llm_kr_eval | 0.4425 |
+ | EL | 0.0522 |
+ | FA | 0.0865 |
+ | NLI | 0.6700 |
+ | QA | 0.5100 |
+ | RC | 0.8937 |
+ | klue_ner_set_f1| 0.0944 |
+ | klue_re_exact_match | 0.0100 |
+ | kmmlu_preview_exact_match | 0.4000 |
+ | kobest_copa_exact_match | 0.8200 |
+ | kobest_hs_exact_match | 0.5500 |
+ | kobest_sn_exact_match | 0.9800 |
+ | kobest_wic_exact_match | 0.6200 |
+ | korea_cg_bleu | 0.0865 |
+ | kornli_exact_match | 0.6400 |
+ | korsts_pearson | 0.8547 |
+ | korsts_spearman| 0.8464 |
 
- ```
- // use LogicKor
- Category: Coding, average single score: 7.71, average multi score: 7.71
- Category: Math, average single score: 5.57, average multi score: 3.86
- Category: Understanding, average single score: 6.86, average multi score: 8.14
- Category: Reasoning, average single score: 8.14, average multi score: 6.43
- Category: Writing, average single score: 8.71, average multi score: 6.86
- Category: Grammar, average single score: 5.29, average multi score: 2.29
- Overall average single score: 7.05
- Overall average multi score: 5.88
- ```
+ LogicKor
+
+ | Category | Average single score | Average multi score |
+ |----------|------------------|-------------------|
+ | Coding | 7.71 | 7.71 |
+ | Math | 5.57 | 3.86 |
+ | Understanding | 6.86 | 8.14 |
+ | Reasoning | 8.14 | 6.43 |
+ | Writing | 8.71 | 6.86 |
+ | Grammar | 5.29 | 2.29 |
 
+ | Category | Average single score | Average multi score |
+ |------------|------------------|-------------------|
+ | Overall single | 7.05 | 5.88 |
 
  ### Configuration
 
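
Because this commit only moves the scores from code blocks into tables, a quick arithmetic check can confirm that no number was altered along the way. The sketch below assumes that `AVG_llm_kr_eval` is the unweighted mean of the five category scores (EL, FA, NLI, QA, RC) and that the LogicKor overall figures are unweighted means over the six categories. The commit does not state this, but the reported values are consistent with it. The variable names are illustrative only.

```python
from statistics import mean

# Category-level scores copied from the diff (assumed inputs to AVG_llm_kr_eval).
llm_kr_eval = {"EL": 0.0522, "FA": 0.0865, "NLI": 0.6700, "QA": 0.5100, "RC": 0.8937}

# LogicKor per-category (single, multi) scores copied from the diff.
logickor = {
    "Coding": (7.71, 7.71),
    "Math": (5.57, 3.86),
    "Understanding": (6.86, 8.14),
    "Reasoning": (8.14, 6.43),
    "Writing": (8.71, 6.86),
    "Grammar": (5.29, 2.29),
}

# Assumption: every aggregate is a plain unweighted mean, rounded as reported.
avg_llm_kr_eval = round(mean(llm_kr_eval.values()), 4)
overall_single = round(mean(single for single, _ in logickor.values()), 2)
overall_multi = round(mean(multi for _, multi in logickor.values()), 2)

print(avg_llm_kr_eval, overall_single, overall_multi)
```

Under that assumption the three results match the reported 0.4425, 7.05, and 5.88, so the aggregate rows in the new tables agree with the per-category rows.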