notadib leaderboard-pr-bot committed on
Commit
ac1b59f
1 Parent(s): 464534c

Adding Evaluation Results (#1)


- Adding Evaluation Results (2bbf7c8ad0ee9bbaaeba118b82921549171f8fec)


Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +118 -2
README.md CHANGED
@@ -1,9 +1,112 @@
  ---
  license: apache-2.0
- pipeline_tag: text-generation
  tags:
  - finetuned
  inference: false
  ---

  # Model Card for Mistral-7B-Instruct-v0.2
@@ -82,4 +185,17 @@ make the model finely respect guardrails, allowing for deployment in environment

  ## The Mistral AI Team

- Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Louis Ternon, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.

  ---
  license: apache-2.0
  tags:
  - finetuned
+ pipeline_tag: text-generation
  inference: false
+ model-index:
+ - name: Mistral-7B-Instruct-v0.2-attention-sparsity-30
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 62.97
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=notadib/Mistral-7B-Instruct-v0.2-attention-sparsity-30
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 84.71
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=notadib/Mistral-7B-Instruct-v0.2-attention-sparsity-30
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 60.49
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=notadib/Mistral-7B-Instruct-v0.2-attention-sparsity-30
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 67.49
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=notadib/Mistral-7B-Instruct-v0.2-attention-sparsity-30
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 77.98
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=notadib/Mistral-7B-Instruct-v0.2-attention-sparsity-30
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 39.42
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=notadib/Mistral-7B-Instruct-v0.2-attention-sparsity-30
+       name: Open LLM Leaderboard
  ---

  # Model Card for Mistral-7B-Instruct-v0.2
 

  ## The Mistral AI Team

+ Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Louis Ternon, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_notadib__Mistral-7B-Instruct-v0.2-attention-sparsity-30)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |65.51|
+ |AI2 Reasoning Challenge (25-Shot)|62.97|
+ |HellaSwag (10-Shot)              |84.71|
+ |MMLU (5-Shot)                    |60.49|
+ |TruthfulQA (0-shot)              |67.49|
+ |Winogrande (5-shot)              |77.98|
+ |GSM8k (5-shot)                   |39.42|
+
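
Reviewer note: the `Avg.` row can be checked against the six benchmark values added in this commit. The sketch below is a minimal check that assumes `Avg.` is the unweighted arithmetic mean of the six scores; the commit itself does not state how it was computed. The names `scores` and `average` are illustrative and not part of the README.

```python
# Benchmark scores copied from the results table added in this commit.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 62.97,
    "HellaSwag (10-Shot)": 84.71,
    "MMLU (5-Shot)": 60.49,
    "TruthfulQA (0-shot)": 67.49,
    "Winogrande (5-shot)": 77.98,
    "GSM8k (5-shot)": 39.42,
}

# Assumption: "Avg." is the unweighted mean over all six benchmarks.
average = sum(scores.values()) / len(scores)

# Print with two decimals, matching the precision used in the table.
print(f"Avg. = {average:.2f}")
```

Under that assumption the recomputed value agrees with the 65.51 reported in the table.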