---
license: apache-2.0
base_model: Replete-AI/Replete-Coder-Qwen2-1.5b
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
datasets:
- Replete-AI/code_bagel_hermes-2.5
- Replete-AI/code_bagel
- Replete-AI/OpenHermes-2.5-Uncensored
- teknium/OpenHermes-2.5
- layoric/tiny-codes-alpaca
- glaiveai/glaive-code-assistant-v3
- ajibawa-2023/Code-290k-ShareGPT
- TIGER-Lab/MathInstruct
- chargoddard/commitpack-ft-instruct-rated
- iamturun/code_instructions_120k_alpaca
- ise-uiuc/Magicoder-Evol-Instruct-110K
- cognitivecomputations/dolphin-coder
- nickrosh/Evol-Instruct-Code-80k-v1
- coseal/CodeUltraFeedback_binarized
- glaiveai/glaive-function-calling-v2
- CyberNative/Code_Vulnerability_Security_DPO
- jondurbin/airoboros-2.2
- camel-ai
- lmsys/lmsys-chat-1m
- CollectiveCognition/chats-data-2023-09-22
- CoT-Alpaca-GPT4
- WizardLM/WizardLM_evol_instruct_70k
- WizardLM/WizardLM_evol_instruct_V2_196k
- teknium/GPT4-LLM-Cleaned
- GPTeacher
- OpenGPT
- meta-math/MetaMathQA
- Open-Orca/SlimOrca
- garage-bAInd/Open-Platypus
- anon8231489123/ShareGPT_Vicuna_unfiltered
- Unnatural-Instructions-GPT4
---

# Quant Infos

- Quants generated with an importance matrix for reduced quantization loss
- GGUFs & imatrix generated from bf16 weights for "optimal" accuracy loss
- Wide coverage of different GGUF quant types, from `Q8_0` down to `IQ1_S`
- Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit [4bfe50f741479c1df1c377260c3ff5702586719e](https://github.com/ggerganov/llama.cpp/commit/4bfe50f741479c1df1c377260c3ff5702586719e) (master as of 2024-06-11)
- Imatrix generated with [this](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8) multi-purpose dataset by [bartowski](https://huggingface.co/bartowski).

```
./imatrix -c 512 -m $model_name-bf16.gguf -f calibration_datav3.txt -o $model_name.imatrix
```
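
As a sketch of the step that follows, the generated imatrix can then be passed when producing each quant type listed above. This assumes the `quantize` binary from the same llama.cpp build; the file names are illustrative, not the actual release artifacts:

```shell
# Illustrative only: produce one of the listed quants from the bf16 GGUF,
# guided by the imatrix generated above ($model_name as in the command above).
./quantize --imatrix $model_name.imatrix \
    $model_name-bf16.gguf $model_name-IQ2_XS.gguf IQ2_XS
```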

# Original Model Card:

# Replete-Coder-Qwen2-1.5b
Finetuned by: Rombodawg
### More than just a coding model!
Although Replete-Coder has amazing coding capabilities, it is trained on a vast amount of non-coding data, fully cleaned and uncensored. Don't just use it for coding, use it for all your needs! We are truly trying to make the GPT killer!
![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/-0dERC793D9XeFsJ9uHbx.png)

Thank you to TensorDock for sponsoring Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b.
You can check out their website for cloud compute rental below.
- https://tensordock.com
__________________________________________________________________________________________________
Replete-Coder-Qwen2-1.5b is a general-purpose model that is specially trained for coding in over 100 programming languages. The data used to train the model contains 25% non-code instruction data and 75% coding instruction data, totaling 3.9 million lines, roughly 1 billion tokens, or 7.27 GB of instruct data. The data used to train this model was fully uncensored and deduplicated before training.

The Replete-Coder models (including Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b) feature the following:

- Advanced coding capabilities in over 100 programming languages
- Advanced code translation (between languages)
- Security- and vulnerability-prevention-related coding capabilities
- General-purpose use
- Uncensored use
- Function calling
- Advanced math use
- Use on low-end (8b) and mobile (1.5b) platforms

Notice: The Replete-Coder series of models is fine-tuned on a context window of 8192 tokens. Performance past this context window is not guaranteed.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/ADHZysQCKxiSordZRwuj_.png)
__________________________________________________________________________________________________

You can find the 25% non-coding instruction data below:

- https://huggingface.co/datasets/Replete-AI/OpenHermes-2.5-Uncensored

And the 75% coding-specific instruction data below:

- https://huggingface.co/datasets/Replete-AI/code_bagel

These two datasets were combined to create the final dataset for training, which is linked below:

- https://huggingface.co/datasets/Replete-AI/code_bagel_hermes-2.5
__________________________________________________________________________________________________
## Prompt Template: ChatML
```
<|im_start|>system
{}<|im_end|>

<|im_start|>user
{}<|im_end|>

<|im_start|>assistant
{}
```
Note: The system prompt varies in the training data, but the most commonly used one is:
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.
```
End token:
```
<|endoftext|>
```
__________________________________________________________________________________________________
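
For illustration, the ChatML template above can be filled in programmatically. A minimal Python sketch follows; the `build_prompt` helper is hypothetical, not part of the model's tooling:

```python
# Hypothetical helper: fills the system/user slots of the ChatML template
# shown above and leaves the assistant turn open for generation.
def build_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_prompt(
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.",
    "Write a function that reverses a string.",
)
# Generation should be stopped when the model emits the <|endoftext|> end token.
```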
Thank you to the community for your contributions to the Replete-AI/code_bagel_hermes-2.5 dataset. Without the participation of so many members making their datasets free and open source for anyone to use, this amazing AI model wouldn't be possible.

Extra special thanks to Teknium for the Open-Hermes-2.5 dataset and jondurbin for the bagel dataset and the naming idea for the code_bagel series of datasets. You can find both of their Hugging Face accounts linked below:

- https://huggingface.co/teknium
- https://huggingface.co/jondurbin

Another special thanks to Unsloth for being the main method of training for Replete-Coder. Below you can find their GitHub, as well as the special Replete-AI secret sauce (Unsloth + QLoRA + GaLore) Colab notebook that was used to train this model.

- https://github.com/unslothai/unsloth
- https://colab.research.google.com/drive/1eXGqy5M--0yW4u0uRnmNgBka-tDk2Li0?usp=sharing
__________________________________________________________________________________________________

## Join the Replete-AI Discord! We are a great and loving community!

- https://discord.gg/ZZbnsmVnjD