leaderboard-pr-bot committed on
Commit 89f7907
1 Parent(s): 3af62c7

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +118 -1
README.md CHANGED
@@ -4,10 +4,10 @@ language:
  license: other
  tags:
  - uncensored
+ base_model: ehartford/Wizard-Vicuna-30B-Uncensored
  datasets:
  - ehartford/wizard_vicuna_70k_unfiltered
  model_name: Wizard Vicuna 30B Uncensored
- base_model: ehartford/Wizard-Vicuna-30B-Uncensored
  inference: false
  model_creator: Eric Hartford
  model_type: llama
@@ -17,6 +17,109 @@ prompt_template: 'A chat between a curious user and an artificial intelligence a
 
  '
  quantized_by: TheBloke
+ model-index:
+ - name: Wizard-Vicuna-30B-Uncensored-GPTQ
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 61.09
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Wizard-Vicuna-30B-Uncensored-GPTQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 82.4
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Wizard-Vicuna-30B-Uncensored-GPTQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 56.46
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Wizard-Vicuna-30B-Uncensored-GPTQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 49.9
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Wizard-Vicuna-30B-Uncensored-GPTQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 77.66
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Wizard-Vicuna-30B-Uncensored-GPTQ
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 23.28
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TheBloke/Wizard-Vicuna-30B-Uncensored-GPTQ
+       name: Open LLM Leaderboard
  ---
 
  <!-- header start -->
@@ -321,3 +424,17 @@ You are responsible for anything you do with the model, just as you are responsi
  Publishing anything this model generates is the same as publishing it yourself.
 
  You are responsible for the content you publish, and you cannot blame the model any more than you can blame the knife, gun, lighter, or car for what you do with it.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_TheBloke__Wizard-Vicuna-30B-Uncensored-GPTQ)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |58.47|
+ |AI2 Reasoning Challenge (25-Shot)|61.09|
+ |HellaSwag (10-Shot)              |82.40|
+ |MMLU (5-Shot)                    |56.46|
+ |TruthfulQA (0-shot)              |49.90|
+ |Winogrande (5-shot)              |77.66|
+ |GSM8k (5-shot)                   |23.28|
+
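
As a quick sanity check on the table this PR adds, the `Avg.` row can be compared with the plain mean of the six benchmark scores. The sketch below is not part of the PR and is not the leaderboard's own code; the `scores` dictionary and the `mean_score` helper are hypothetical names introduced for illustration, and the values are the rounded figures copied from the table. It assumes the average is an unweighted mean over the six benchmarks.

```python
from decimal import Decimal

# Scores copied from the results table (already rounded to two decimals).
# Decimal avoids binary floating-point noise when summing these values.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": Decimal("61.09"),
    "HellaSwag (10-Shot)": Decimal("82.40"),
    "MMLU (5-Shot)": Decimal("56.46"),
    "TruthfulQA (0-shot)": Decimal("49.90"),
    "Winogrande (5-shot)": Decimal("77.66"),
    "GSM8k (5-shot)": Decimal("23.28"),
}


def mean_score(values):
    """Return the unweighted mean of an iterable of Decimal scores."""
    values = list(values)
    return sum(values, Decimal(0)) / len(values)


average = mean_score(scores.values())
reported = Decimal("58.47")  # the "Avg." row in the table

print(average)                    # mean of the rounded table values
print(abs(average - reported))    # gap to the reported average
```

The mean of the rounded values is 58.465, which is within 0.01 of the reported 58.47. The small gap is what one would expect if the reported average was computed from unrounded scores, though the PR does not state this.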