satyamt committed
Commit c4c87ff
1 Parent(s): 0dcab31

Update README.md

Files changed (1):
  1. README.md +119 -1
README.md CHANGED
@@ -9,4 +9,122 @@ language:
  tags:
  - medical
  ---
- [Technoculture/MD7b-alpha](https://huggingface.co/Technoculture/MD7b-alpha) adapter merged with its Base Model (Meditron 7B)
+ [Technoculture/MD7b-alpha](https://huggingface.co/Technoculture/MD7b-alpha) adapter merged with its base model (Meditron 7B)
+
+ ---
+
+ | Model | ARC |HellaSwag| MMLU |TruthfulQA|Winogrande|GSM8K|
+ |---------------------------------------------------|----:|--------:|--------------------------|---------:|---------:|----:|
+ |[MT7Bi](https://huggingface.co/Technoculture/MT7Bi)|50.94| 73.24|n/a (results file missing)| 43.04| 72.06|22.52|
+
+ ### ARC
+ | Task |Version| Metric | Value | |Stderr|
+ |-------------|-------|--------------------|-------------|---|------|
+ |arc_challenge|Yaml |acc,none | 0.48| | |
+ | | |acc_stderr,none | 0.01| | |
+ | | |acc_norm,none | 0.51| | |
+ | | |acc_norm_stderr,none| 0.01| | |
+ | | |alias |arc_challenge| | |
+
+ Average: 50.94%
+
+ ### HellaSwag
+ | Task |Version| Metric | Value | |Stderr|
+ |---------|-------|--------------------|---------|---|------|
+ |hellaswag|Yaml |acc,none | 0.54| | |
+ | | |acc_stderr,none | 0| | |
+ | | |acc_norm,none | 0.73| | |
+ | | |acc_norm_stderr,none| 0| | |
+ | | |alias |hellaswag| | |
+
+ Average: 73.24%
+
+ ### MMLU
+
+ Average: n/a (results file missing)
+
+ ### TruthfulQA
+ | Task |Version| Metric | Value | |Stderr|
+ |--------------|-------|-----------------------|-----------------|---|------|
+ |truthfulqa |N/A |bleu_max,none | 16.17| | |
+ | | |bleu_max_stderr,none | 0.38| | |
+ | | |bleu_acc,none | 0.36| | |
+ | | |bleu_acc_stderr,none | 0| | |
+ | | |bleu_diff,none | -2.78| | |
+ | | |bleu_diff_stderr,none | 0.26| | |
+ | | |rouge1_max,none | 39.99| | |
+ | | |rouge1_max_stderr,none | 0.64| | |
+ | | |rouge1_acc,none | 0.36| | |
+ | | |rouge1_acc_stderr,none | 0| | |
+ | | |rouge1_diff,none | -4.19| | |
+ | | |rouge1_diff_stderr,none| 0.45| | |
+ | | |rouge2_max,none | 24.52| | |
+ | | |rouge2_max_stderr,none | 0.68| | |
+ | | |rouge2_acc,none | 0.29| | |
+ | | |rouge2_acc_stderr,none | 0| | |
+ | | |rouge2_diff,none | -4.90| | |
+ | | |rouge2_diff_stderr,none| 0.55| | |
+ | | |rougeL_max,none | 36.52| | |
+ | | |rougeL_max_stderr,none | 0.64| | |
+ | | |rougeL_acc,none | 0.33| | |
+ | | |rougeL_acc_stderr,none | 0| | |
+ | | |rougeL_diff,none | -4.56| | |
+ | | |rougeL_diff_stderr,none| 0.45| | |
+ | | |acc,none | 0.33| | |
+ | | |acc_stderr,none | 0.05| | |
+ | | |alias |truthfulqa | | |
+ |truthfulqa_gen|Yaml |bleu_max,none | 16.17| | |
+ | | |bleu_max_stderr,none | 0.61| | |
+ | | |bleu_acc,none | 0.36| | |
+ | | |bleu_acc_stderr,none | 0.02| | |
+ | | |bleu_diff,none | -2.78| | |
+ | | |bleu_diff_stderr,none | 0.51| | |
+ | | |rouge1_max,none | 39.99| | |
+ | | |rouge1_max_stderr,none | 0.80| | |
+ | | |rouge1_acc,none | 0.36| | |
+ | | |rouge1_acc_stderr,none | 0.02| | |
+ | | |rouge1_diff,none | -4.19| | |
+ | | |rouge1_diff_stderr,none| 0.67| | |
+ | | |rouge2_max,none | 24.52| | |
+ | | |rouge2_max_stderr,none | 0.83| | |
+ | | |rouge2_acc,none | 0.29| | |
+ | | |rouge2_acc_stderr,none | 0.02| | |
+ | | |rouge2_diff,none | -4.90| | |
+ | | |rouge2_diff_stderr,none| 0.74| | |
+ | | |rougeL_max,none | 36.52| | |
+ | | |rougeL_max_stderr,none | 0.80| | |
+ | | |rougeL_acc,none | 0.33| | |
+ | | |rougeL_acc_stderr,none | 0.02| | |
+ | | |rougeL_diff,none | -4.56| | |
+ | | |rougeL_diff_stderr,none| 0.67| | |
+ | | |alias | - truthfulqa_gen| | |
+ |truthfulqa_mc1|Yaml |acc,none | 0.28| | |
+ | | |acc_stderr,none | 0.02| | |
+ | | |alias | - truthfulqa_mc1| | |
+ |truthfulqa_mc2|Yaml |acc,none | 0.43| | |
+ | | |acc_stderr,none | 0.01| | |
+ | | |alias | - truthfulqa_mc2| | |
+
+ Average: 43.04%
+
+ ### Winogrande
+ | Task |Version| Metric | Value | |Stderr|
+ |----------|-------|---------------|----------|---|------|
+ |winogrande|Yaml |acc,none | 0.72| | |
+ | | |acc_stderr,none| 0.01| | |
+ | | |alias |winogrande| | |
+
+ Average: 72.06%
+
+ ### GSM8K
+ |Task |Version| Metric |Value| |Stderr|
+ |-----|-------|-----------------------------|-----|---|------|
+ |gsm8k|Yaml |exact_match,get-answer | 0.23| | |
+ | | |exact_match_stderr,get-answer| 0.01| | |
+ | | |alias |gsm8k| | |
+
+ Average: 22.52%
+
+ Average score: not available (MMLU results missing)
+
+ Elapsed time: 03:56:55
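
The card above describes the [Technoculture/MD7b-alpha](https://huggingface.co/Technoculture/MD7b-alpha) adapter being merged into its base model (Meditron 7B). Below is a minimal sketch of how such a LoRA-adapter merge is commonly done with the `peft` library; the base checkpoint id `epfl-llm/meditron-7b`, the output directory, and the dtype are assumptions, and this is not necessarily the exact procedure used for this commit.

```python
# Hedged sketch: folding a LoRA adapter into its base model with peft.
# Only the adapter id comes from the card; other ids/paths are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "epfl-llm/meditron-7b"    # assumed Meditron 7B checkpoint
ADAPTER = "Technoculture/MD7b-alpha"   # adapter named in the card
OUTPUT_DIR = "MT7Bi-merged"            # hypothetical local output path

# Load the base model, then attach the adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, ADAPTER)

# Fold the adapter weights into the base weights and drop the PEFT wrappers,
# leaving a plain standalone checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained(OUTPUT_DIR)
AutoTokenizer.from_pretrained(BASE_MODEL).save_pretrained(OUTPUT_DIR)
```

The merged checkpoint can then be loaded with `AutoModelForCausalLM.from_pretrained("MT7Bi-merged")` like any ordinary model, with no `peft` dependency at inference time.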
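The benchmark tables above match the output format of EleutherAI's lm-evaluation-harness (metric names such as `acc,none` and `exact_match,get-answer`). Assuming that harness was used, the sketch below re-runs the same task set against the merged model; the few-shot settings, batch size, and harness version of the original run are not documented here and are assumptions.

```python
# Hedged sketch: re-running the reported benchmarks with lm-evaluation-harness.
# Task names mirror the section headings above; all run settings are assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Technoculture/MT7Bi,dtype=float16",
    tasks=["arc_challenge", "hellaswag", "mmlu", "truthfulqa", "winogrande", "gsm8k"],
    batch_size=8,
)

# Print per-task metrics (acc, acc_norm, exact_match, ...) like those tabulated above.
for task, metrics in sorted(results["results"].items()):
    print(task, metrics)
```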