Adding Evaluation Results

#1
Files changed (1)
  1. README.md +116 -1
README.md CHANGED
@@ -10,6 +10,109 @@ datasets:
  license_name: hsul
  license_link: https://huggingface.co/OEvortex/vortex-3b/raw/main/LICENSE.md
  pipeline_tag: text-generation
+ model-index:
+ - name: vortex-3b-v2
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 39.68
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OEvortex/vortex-3b-v2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 65.04
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OEvortex/vortex-3b-v2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 25.09
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OEvortex/vortex-3b-v2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 33.8
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OEvortex/vortex-3b-v2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 59.12
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OEvortex/vortex-3b-v2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 2.05
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=OEvortex/vortex-3b-v2
+       name: Open LLM Leaderboard
  ---
  ![Vortex 3b](https://cdn-lfs-us-1.huggingface.co/repos/68/cb/68cb18839210e9d774c72c739ef72fb95cb03f6f857e6b5a2377406f2078e65a/04b4c8104a551ec9754fd1169842ee67c06ced0fb16569b5fca804c2068578e9?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27vortex%25203b.png%3B+filename%3D%22vortex+3b.png%22%3B&response-content-type=image%2Fpng&Expires=1711282309&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcxMTI4MjMwOX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzY4L2NiLzY4Y2IxODgzOTIxMGU5ZDc3NGM3MmM3MzllZjcyZmI5NWNiMDNmNmY4NTdlNmI1YTIzNzc0MDZmMjA3OGU2NWEvMDRiNGM4MTA0YTU1MWVjOTc1NGZkMTE2OTg0MmVlNjdjMDZjZWQwZmIxNjU2OWI1ZmNhODA0YzIwNjg1NzhlOT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSomcmVzcG9uc2UtY29udGVudC10eXBlPSoifV19&Signature=T0d7vhvs9DjdMClzOOEyJMVe4e33u7A-9VTvApC3lMQgW7GXt-hOACXDWDp2TQMv6WoxL7DmGBh-d2uXvGT-2Sqx%7EUcce2pxq1yqtkMEi7sf2WYSHtmaXrIWAiVF%7EPeidG1wRr8wfUY3qIjlKtMVJs7RGbzvAgyvhscazuqutIj37tlIjEcYUXYOuYZoqA3OhoXfpiawCvmXc%7EAul-bAWwAYx91BWvGw9fw4tv20wisJDsh6BV7HEWnV%7EYbXvCxxlZZ4BbcWrYDN%7EMRf48EElCacf5KMpDMbCa52rO-ZvXCWgap%7EzUaIemRSQ84rpgTlVKb--D3GL3pUwRroaiRF7A__&Key-Pair-Id=KCD77M1F0VK2B)
  **Model Overview**
@@ -37,4 +140,16 @@ text = pipeline(model="OEvortex/vortex-3b-v2", torch_dtype=torch.bfloat16, devic
 
  res = text("Explain to me the difference between nuclear fission and fusion.")
  print(res[0]["text"])
- ```
+ ```
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_OEvortex__vortex-3b-v2)
+ | Metric |Value|
+ |---------------------------------|----:|
+ |Avg. |37.46|
+ |AI2 Reasoning Challenge (25-Shot)|39.68|
+ |HellaSwag (10-Shot) |65.04|
+ |MMLU (5-Shot) |25.09|
+ |TruthfulQA (0-shot) |33.80|
+ |Winogrande (5-shot) |59.12|
+ |GSM8k (5-shot) | 2.05|
+
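
The `Avg.` row in the added table appears to be the unweighted mean of the six benchmark scores. The PR does not state this, so treat it as an assumption that happens to match the numbers. A minimal Python sketch of that check follows; the helper name `leaderboard_average` is hypothetical and is not part of any leaderboard tooling.

```python
# Scores copied from the results table in this PR (percentages).
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 39.68,
    "HellaSwag (10-Shot)": 65.04,
    "MMLU (5-Shot)": 25.09,
    "TruthfulQA (0-shot)": 33.80,
    "Winogrande (5-shot)": 59.12,
    "GSM8k (5-shot)": 2.05,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean, rounded to two decimals.

    Assumption: the table's "Avg." weights every benchmark equally.
    """
    values = list(values)
    if not values:
        raise ValueError("need at least one score")
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. {average:.2f}")  # prints "Avg. 37.46"
```

The six scores sum to 224.78, and 224.78 / 6 rounds to 37.46, which agrees with the `Avg.` value reported in the table.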