leaderboard-pr-bot committed
Commit 59260f1
Parent(s): a0c0f4f

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +121 -5
README.md CHANGED
@@ -1,14 +1,117 @@
 ---
-license: apache-2.0
 language:
 - en
-datasets:
-- Intel/orca_dpo_pairs
-pipeline_tag: conversational
+license: apache-2.0
 library_name: peft
 tags:
 - llm
 - 7b
+datasets:
+- Intel/orca_dpo_pairs
+pipeline_tag: conversational
+model-index:
+- name: jaskier-7b-dpo-v2
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 69.28
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=bardsai/jaskier-7b-dpo-v2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 86.8
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=bardsai/jaskier-7b-dpo-v2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 64.92
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=bardsai/jaskier-7b-dpo-v2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 61.64
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=bardsai/jaskier-7b-dpo-v2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 80.74
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=bardsai/jaskier-7b-dpo-v2
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 71.8
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=bardsai/jaskier-7b-dpo-v2
+      name: Open LLM Leaderboard
 ---
 # Jaskier 7b DPO V2
 
@@ -73,4 +176,17 @@ print(sequences[0])
 
 At bards.ai, we focus on providing machine learning expertise and skills to our partners, particularly in the areas of nlp, machine vision and time series analysis. Our team is located in Wroclaw, Poland. Please visit our website for more information: bards.ai
 
-Let us know if you use our model :). Also, if you need any help, feel free to contact us at info@bards.ai
+Let us know if you use our model :). Also, if you need any help, feel free to contact us at info@bards.ai
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_bardsai__jaskier-7b-dpo-v2)
+
+| Metric |Value|
+|---------------------------------|----:|
+|Avg. |72.53|
+|AI2 Reasoning Challenge (25-Shot)|69.28|
+|HellaSwag (10-Shot) |86.80|
+|MMLU (5-Shot) |64.92|
+|TruthfulQA (0-shot) |61.64|
+|Winogrande (5-shot) |80.74|
+|GSM8k (5-shot) |71.80|
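The "Avg." row in the table above is simply the mean of the six per-benchmark scores from the model-index metadata. A minimal sketch, with the PR's results mirrored as a flat Python dict (a simplified shape, not the exact nested YAML layout), recomputing that average:

```python
# Simplified mirror of the model-index results added by this PR
# (dataset/metric names and values copied from the diff above).
model_index = {
    "name": "jaskier-7b-dpo-v2",
    "results": [
        {"dataset": "AI2 Reasoning Challenge (25-Shot)", "metric": "acc_norm", "value": 69.28},
        {"dataset": "HellaSwag (10-Shot)", "metric": "acc_norm", "value": 86.80},
        {"dataset": "MMLU (5-Shot)", "metric": "acc", "value": 64.92},
        {"dataset": "TruthfulQA (0-shot)", "metric": "mc2", "value": 61.64},
        {"dataset": "Winogrande (5-shot)", "metric": "acc", "value": 80.74},
        {"dataset": "GSM8k (5-shot)", "metric": "acc", "value": 71.80},
    ],
}

# Unweighted mean over the six benchmarks, as reported by the leaderboard.
avg = sum(r["value"] for r in model_index["results"]) / len(model_index["results"])
print(f"Avg. {avg:.2f}")  # prints "Avg. 72.53"
```

This reproduces the 72.53 average shown in the README table.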