rombodawg committed
Commit 8f11f70
1 Parent(s): dcdacfe

Update README.md

Files changed (1): README.md (+217 -11)
README.md CHANGED
@@ -1,22 +1,228 @@
  ---
- base_model: NousResearch/Meta-Llama-3-8B
- language:
- - en
- license: apache-2.0
  tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
- - trl
  ---

- # Uploaded model

- - **Developed by:** Replete-AI
- - **License:** apache-2.0
- - **Finetuned from model :** NousResearch/Meta-Llama-3-8B

- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
  ---
+ license: other
+ license_name: llama-3
+ license_link: https://llama.meta.com/llama3/license/
  tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
+ datasets:
+ - Replete-AI/code_bagel_hermes-2.5
+ - Replete-AI/code_bagel
+ - Replete-AI/OpenHermes-2.5-Uncensored
+ - teknium/OpenHermes-2.5
+ - layoric/tiny-codes-alpaca
+ - glaiveai/glaive-code-assistant-v3
+ - ajibawa-2023/Code-290k-ShareGPT
+ - TIGER-Lab/MathInstruct
+ - chargoddard/commitpack-ft-instruct-rated
+ - iamturun/code_instructions_120k_alpaca
+ - ise-uiuc/Magicoder-Evol-Instruct-110K
+ - cognitivecomputations/dolphin-coder
+ - nickrosh/Evol-Instruct-Code-80k-v1
+ - coseal/CodeUltraFeedback_binarized
+ - glaiveai/glaive-function-calling-v2
+ - CyberNative/Code_Vulnerability_Security_DPO
+ - jondurbin/airoboros-2.2
+ - camel-ai
+ - lmsys/lmsys-chat-1m
+ - CollectiveCognition/chats-data-2023-09-22
+ - CoT-Alpaca-GPT4
+ - WizardLM/WizardLM_evol_instruct_70k
+ - WizardLM/WizardLM_evol_instruct_V2_196k
+ - teknium/GPT4-LLM-Cleaned
+ - GPTeacher
+ - OpenGPT
+ - meta-math/MetaMathQA
+ - Open-Orca/SlimOrca
+ - garage-bAInd/Open-Platypus
+ - anon8231489123/ShareGPT_Vicuna_unfiltered
+ - Unnatural-Instructions-GPT4
+ model-index:
+ - name: Replete-Coder-llama3-8b
+   results:
+   - task:
+       name: HumanEval
+       type: text-generation
+     dataset:
+       type: openai_humaneval
+       name: HumanEval
+     metrics:
+     - name: pass@1
+       type: pass@1
+       value:
+       verified: false
+   - task:
+       name: AI2 Reasoning Challenge
+       type: text-generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: accuracy
+       value:
+       name: normalized accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: accuracy
+       value:
+       name: normalized accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: accuracy
+       value:
+       name: accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: multiple_choice_accuracy
+       value:
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: accuracy
+       value:
+       name: accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
+   - task:
+       name: Text Generation
+       type: text-generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: accuracy
+       value:
+       name: accuracy
+     source:
+       url: https://www.placeholderurl.com
+       name: Open LLM Leaderboard
  ---
+ # Replete-Coder-llama3-8b
+ Finetuned by: Rombodawg
+ ### More than just a coding model!
+ Although Replete-Coder has amazing coding capabilities, it was also trained on a vast amount of non-coding data, fully cleaned and uncensored. Don't just use it for coding; use it for all your needs! We are truly trying to make the GPT killer!
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/-0dERC793D9XeFsJ9uHbx.png)

+ Thank you to TensorDock for sponsoring Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b.
+ You can check out their website for cloud compute rentals below.
+ - https://tensordock.com
+ __________________________________________________________________________________________________
+ Replete-Coder-llama3-8b is a general-purpose model specially trained for coding in over 100 coding languages. The training data contains 25% non-code instruction data and 75% coding instruction data, totaling 3.9 million lines, roughly 1 billion tokens, or 7.27 GB of instruct data. The data was fully uncensored and deduplicated before training.

+ The Replete-Coder models (including Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b) feature the following:

+ - Advanced coding capabilities in over 100 coding languages
+ - Advanced code translation (between languages)
+ - Coding capabilities related to security and vulnerability prevention
+ - General-purpose use
+ - Uncensored use
+ - Function calling
+ - Advanced math use
+ - Use on low-end (8b) and mobile (1.5b) platforms

+ Notice: the Replete-Coder series of models is fine-tuned on a context window of 8192 tokens. Performance past this context window is not guaranteed.
+ __________________________________________________________________________________________________
+ You can find the 25% non-coding instruction data below:
+
+ - https://huggingface.co/datasets/Replete-AI/OpenHermes-2.5-Uncensored
+
+ And the 75% coding-specific instruction data below:
+
+ - https://huggingface.co/datasets/Replete-AI/code_bagel
+
+ These two datasets were combined to create the final dataset for training, which is linked below:
+
+ - https://huggingface.co/datasets/Replete-AI/code_bagel_hermes-2.5
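+
+ If you want to inspect the combined training data yourself, it can be pulled with the Hugging Face `datasets` library. A minimal sketch (the "train" split name is an assumption about the dataset layout):
+
+ ```python
+ # Load the combined Replete-AI training dataset (split name assumed)
+ from datasets import load_dataset
+
+ dataset = load_dataset("Replete-AI/code_bagel_hermes-2.5", split="train")
+ print(dataset)     # features and row count
+ print(dataset[0])  # first training example
+ ```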
+ __________________________________________________________________________________________________
+ ## Prompt Template: Custom Alpaca
+ ```
+ ### System:
+ {}
+
+ ### Instruction:
+ {}
+
+ ### Response:
+ {}
+ ```
+ Note: the system prompt varies across the training data, but the most commonly used one is:
+ ```
+ Below is an instruction that describes a task, Write a response that appropriately completes the request.
+ ```
+ End token:
+ ```
+ <|endoftext|>
+ ```
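+
+ For reference, a minimal inference sketch using this template with the `transformers` library is shown below. The repo id, instruction, and generation settings are illustrative assumptions, not documented values.
+
+ ```python
+ # Minimal inference sketch for the custom Alpaca template (repo id assumed)
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "Replete-AI/Replete-Coder-llama3-8b"  # assumed Hugging Face repo id
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+
+ # Fill in the template exactly as documented above
+ prompt = (
+     "### System:\n"
+     "Below is an instruction that describes a task, Write a response that appropriately completes the request.\n\n"
+     "### Instruction:\n"
+     "Write a Python function that reverses a string.\n\n"
+     "### Response:\n"
+ )
+
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=256)
+ # Strip the prompt tokens and decode only the generated continuation
+ print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```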
+ __________________________________________________________________________________________________
+ Thank you to the community for your contributions to the Replete-AI/code_bagel_hermes-2.5 dataset. Without the participation of so many members making their datasets free and open source for anyone to use, this amazing AI model wouldn't be possible.
+
+ Extra special thanks to Teknium for the OpenHermes-2.5 dataset, and to jondurbin for the bagel dataset and the naming idea for the code_bagel series of datasets. You can find both of their Hugging Face accounts linked below:
+
+ - https://huggingface.co/teknium
+ - https://huggingface.co/jondurbin
+
+ Another special thanks to Unsloth, the main training method for Replete-Coder. Below you can find their GitHub, as well as the special Replete-AI secret sauce (Unsloth + QLoRA + GaLore) Colab notebook that was used to train this model.
+
+ - https://github.com/unslothai/unsloth
+ - https://colab.research.google.com/drive/1VAaxMQJN9-78WLsPU0GWg5tEkasXoTP9?usp=sharing
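+
+ As a rough illustration of that recipe, here is a minimal Unsloth + QLoRA setup sketch. The hyperparameters (rank, alpha, target modules) are assumptions for illustration, not the values actually used; see the linked Colab for the real configuration, including how the GaLore optimizer was wired in.
+
+ ```python
+ # Sketch of an Unsloth + QLoRA setup (hyperparameters assumed, not the actual recipe)
+ from unsloth import FastLanguageModel
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="NousResearch/Meta-Llama-3-8B",  # base model named in the old card
+     max_seq_length=8192,                        # matches the documented context window
+     load_in_4bit=True,                          # QLoRA: 4-bit quantized base weights
+ )
+
+ # Attach LoRA adapters on top of the quantized base model
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=16,                                       # assumed LoRA rank
+     lora_alpha=16,                              # assumed scaling factor
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+ )
+ # GaLore is enabled at the Trainer level (e.g. transformers' optim="galore_adamw_8bit");
+ # consult the Colab above for how the pieces were actually combined.
+ ```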
+ __________________________________________________________________________________________________
+ ## Join the Replete-AI Discord! We are a great and loving community!
+
+ - https://discord.gg/ZZbnsmVnjD