bartowski committed on
Commit
cbff6ea
1 Parent(s): e8891d1

measurement.json

Files changed (2)
  1. README.md +222 -0
  2. measurement.json +0 -0
README.md ADDED
@@ -0,0 +1,222 @@
+ ---
+ license: other
+ license_name: llama-3
+ license_link: https://llama.meta.com/llama3/license/
+ tags:
+ - text-generation-inference
+ - transformers
+ - unsloth
+ - llama
+ datasets:
+ - Replete-AI/code_bagel_hermes-2.5
+ - Replete-AI/code_bagel
+ - Replete-AI/OpenHermes-2.5-Uncensored
+ - teknium/OpenHermes-2.5
+ - layoric/tiny-codes-alpaca
+ - glaiveai/glaive-code-assistant-v3
+ - ajibawa-2023/Code-290k-ShareGPT
+ - TIGER-Lab/MathInstruct
+ - chargoddard/commitpack-ft-instruct-rated
+ - iamturun/code_instructions_120k_alpaca
+ - ise-uiuc/Magicoder-Evol-Instruct-110K
+ - cognitivecomputations/dolphin-coder
+ - nickrosh/Evol-Instruct-Code-80k-v1
+ - coseal/CodeUltraFeedback_binarized
+ - glaiveai/glaive-function-calling-v2
+ - CyberNative/Code_Vulnerability_Security_DPO
+ - jondurbin/airoboros-2.2
+ - camel-ai
+ - lmsys/lmsys-chat-1m
+ - CollectiveCognition/chats-data-2023-09-22
+ - CoT-Alpaca-GPT4
+ - WizardLM/WizardLM_evol_instruct_70k
+ - WizardLM/WizardLM_evol_instruct_V2_196k
+ - teknium/GPT4-LLM-Cleaned
+ - GPTeacher
+ - OpenGPT
+ - meta-math/MetaMathQA
+ - Open-Orca/SlimOrca
+ - garage-bAInd/Open-Platypus
+ - anon8231489123/ShareGPT_Vicuna_unfiltered
+ - Unnatural-Instructions-GPT4
+ model-index:
+ - name: Replete-Coder-llama3-8b
+   results:
+   - task:
+       name: HumanEval
+       type: text-generation
+     dataset:
+       type: openai_humaneval
+       name: HumanEval
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value:
+       verified: false
+   - task:
+       name: AI2 Reasoning Challenge
+       type: text-generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: accuracy
+       value:
+       name: normalized accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: accuracy
+       value:
+       name: normalized accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: accuracy
+       value:
+       name: accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: multiple_choice_accuracy
+       value:
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: accuracy
+       value:
+       name: accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: accuracy
+       value:
+       name: accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+ quantized_by: bartowski
+ pipeline_tag: text-generation
+ ---
+
+ ## ExLlamaV2 Quantizations of Replete-Coder-Llama3-8B
+
+ Using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.1.6">turboderp's ExLlamaV2 v0.1.6</a> for quantization.
+
+ <b>The "main" branch contains only the measurement.json; download one of the other branches for the model (see below).</b>
+
+ Each branch contains a single bits-per-weight quantization; the main branch contains only the measurement.json, for use in further conversions.
+
+ Original model: https://huggingface.co/Replete-AI/Replete-Coder-Llama3-8B
+
+ ## Prompt format
+
+ No chat template is specified, so the default is used. This may be incorrect; check the original model card for details.
+
+ ```
+ <|im_start|>system
+ {system_prompt}<|im_end|>
+ <|im_start|>user
+ {prompt}<|im_end|>
+ <|im_start|>assistant
+
+ ```
+
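As a minimal sketch of how the template above could be filled in, the snippet below substitutes a system message and one user turn into it. The helper name `build_prompt` is hypothetical, and the template itself is only the default shown in this README, which may not match the original model's real chat template.

```python
# Template copied from the "Prompt format" section of this README.
# Assumption: this default template is correct for the model; the README
# itself warns that it may not be.
PROMPT_TEMPLATE = (
    "<|im_start|>system\n"
    "{system_prompt}<|im_end|>\n"
    "<|im_start|>user\n"
    "{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_prompt(system_prompt: str, prompt: str) -> str:
    """Fill the template with a system message and a single user turn."""
    return PROMPT_TEMPLATE.format(system_prompt=system_prompt, prompt=prompt)


if __name__ == "__main__":
    print(build_prompt("You are a helpful coding assistant.",
                       "Write a function that reverses a string."))
```

The returned string ends right after the `assistant` header line, so the model's completion continues from there.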
+ ## Available sizes
+
+
+ | Branch | Bits | lm_head bits | VRAM (4k) | VRAM (8k) | VRAM (16k) | VRAM (32k) | Description |
+ | ----- | ---- | ------- | ------ | ------ | ------ | ------ | ------------ |
+ | [8_0](https://huggingface.co/bartowski/Replete-Coder-Llama3-8B-exl2/tree/8_0) | 8.0 | 8.0 | 10.1 GB | 10.5 GB | 11.5 GB | 13.6 GB | Maximum quality that ExLlamaV2 can produce, near unquantized performance. |
+ | [6_5](https://huggingface.co/bartowski/Replete-Coder-Llama3-8B-exl2/tree/6_5) | 6.5 | 8.0 | 8.9 GB | 9.3 GB | 10.3 GB | 12.4 GB | Very similar to 8.0, good tradeoff of size vs performance, **recommended**. |
+ | [5_0](https://huggingface.co/bartowski/Replete-Coder-Llama3-8B-exl2/tree/5_0) | 5.0 | 6.0 | 7.7 GB | 8.1 GB | 9.1 GB | 11.2 GB | Slightly lower quality vs 6.5, but usable on 8GB cards. |
+ | [4_25](https://huggingface.co/bartowski/Replete-Coder-Llama3-8B-exl2/tree/4_25) | 4.25 | 6.0 | 7.0 GB | 7.4 GB | 8.4 GB | 10.5 GB | GPTQ equivalent bits per weight, slightly higher quality. |
+ | [3_5](https://huggingface.co/bartowski/Replete-Coder-Llama3-8B-exl2/tree/3_5) | 3.5 | 6.0 | 6.4 GB | 6.8 GB | 7.8 GB | 9.9 GB | Lower quality, only use if you have to. |
+
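One way to read the table is as a lookup: pick the highest bits-per-weight branch whose VRAM estimate fits the card at the desired context length. The sketch below does that with the table's numbers. The function name `pick_branch` is hypothetical, it assumes "4k" to "32k" mean 4096 to 32768 tokens, and it treats the estimates as total requirements with no extra headroom.

```python
# VRAM estimates in GB, copied from the "Available sizes" table.
# Outer keys are branch names (highest bpw first); inner keys are context
# lengths in tokens. Assumption: "4k" means 4096 tokens, "8k" 8192, etc.
VRAM_GB = {
    "8_0":  {4096: 10.1, 8192: 10.5, 16384: 11.5, 32768: 13.6},
    "6_5":  {4096: 8.9,  8192: 9.3,  16384: 10.3, 32768: 12.4},
    "5_0":  {4096: 7.7,  8192: 8.1,  16384: 9.1,  32768: 11.2},
    "4_25": {4096: 7.0,  8192: 7.4,  16384: 8.4,  32768: 10.5},
    "3_5":  {4096: 6.4,  8192: 6.8,  16384: 7.8,  32768: 9.9},
}


def pick_branch(available_gb, context):
    """Return the highest-bpw branch whose estimate fits, or None if none do."""
    # Dicts preserve insertion order, so branches are tried from 8_0 down.
    for branch, estimates in VRAM_GB.items():
        if estimates[context] <= available_gb:
            return branch
    return None


if __name__ == "__main__":
    print(pick_branch(8.0, 4096))
```

For an 8 GB card at 4k context this selects `5_0`, which agrees with the table's note that 5.0 bpw is usable on 8GB cards. In practice other processes also use VRAM, so leaving some margin below the card's nominal size is sensible.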
+ ## Download instructions
+
+ With git:
+
+ ```shell
+ git clone --single-branch --branch 6_5 https://huggingface.co/bartowski/Replete-Coder-Llama3-8B-exl2 Replete-Coder-Llama3-8B-exl2-6_5
+ ```
+
+ With huggingface hub (credit to TheBloke for instructions):
+
+ ```shell
+ pip3 install huggingface-hub
+ ```
+
+ To download a specific branch, use the `--revision` parameter. For example, to download the 6.5 bpw branch:
+
+ Linux:
+
+ ```shell
+ huggingface-cli download bartowski/Replete-Coder-Llama3-8B-exl2 --revision 6_5 --local-dir Replete-Coder-Llama3-8B-exl2-6_5
+ ```
+
+ Windows (which apparently doesn't like _ in folders sometimes?):
+
+ ```shell
+ huggingface-cli download bartowski/Replete-Coder-Llama3-8B-exl2 --revision 6_5 --local-dir Replete-Coder-Llama3-8B-exl2-6.5
+ ```
+
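The Linux and Windows commands above differ only in the local folder name. A small sketch of building that command for any branch is shown below. The helper `download_command` and its `windows` flag are hypothetical; the CLI name and flags are exactly those used in the commands above, and the snippet only prints the command rather than running it.

```python
import shlex
import sys

REPO_ID = "bartowski/Replete-Coder-Llama3-8B-exl2"


def download_command(branch, windows=sys.platform.startswith("win")):
    """Build the huggingface-cli argument list for one branch.

    On Windows the "_" in the folder suffix is replaced with ".",
    mirroring the README's Windows example (6_5 -> 6.5).
    """
    suffix = branch.replace("_", ".") if windows else branch
    local_dir = f"{REPO_ID.split('/')[1]}-{suffix}"
    return ["huggingface-cli", "download", REPO_ID,
            "--revision", branch, "--local-dir", local_dir]


if __name__ == "__main__":
    # Print the command instead of executing it; pass the list to
    # subprocess.run(..., check=True) to actually download.
    print(shlex.join(download_command("6_5")))
```

Note that only the folder name changes on Windows; the `--revision` value must stay `6_5` because that is the branch name on the hub.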
+ Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
measurement.json ADDED
The diff for this file is too large to render. See raw diff