Update README.md
README.md
CHANGED
@@ -3,3 +3,30 @@ license: other
|
|
3 |
license_name: deepseek
|
4 |
license_link: https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/LICENSE-MODEL
|
5 |
---
|
6 |
+
DeepMagic-Coder-7b
|
7 |
+
|
8 |
+
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/LlbswwXZQoIQziTNEMSMk.jpeg)
|
9 |
+
|
10 |
+
This is an extremely successful merge of the deepseek-coder-6.7b-instruct and Magicoder-S-DS-6.7B models, bringing an uplift in overall coding performance without any compromise to the models' integrity (at least in limited testing).
|
11 |
+
|
12 |
+
This is the first of my models to use Mergekit's *task_arithmetic* merging method. The method is detailed below, and is clearly very useful for merging AI models that were fine-tuned from a common base:
|
13 |
+
```
|
14 |
+
Computes "task vectors" for each model by subtracting a base model. Merges the task vectors linearly and adds back the base. Works great for models that were fine tuned from a common ancestor. Also a super useful mental framework for several of the more involved merge methods.
|
15 |
+
```
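To make the description above concrete, here is a minimal sketch of the task-vector idea in plain Python. It is not Mergekit's implementation: the function name `task_arithmetic_merge`, the toy `StateDict` type, and the choice to normalize by dividing each weight by the sum of the weights are all assumptions made for illustration.

```python
from typing import Dict, List

# Toy stand-in for a model's named tensors (hypothetical type, not a Mergekit API).
StateDict = Dict[str, List[float]]


def task_arithmetic_merge(
    base: StateDict,
    finetuned: List[StateDict],
    weights: List[float],
    normalize: bool = True,
) -> StateDict:
    """Merge fine-tuned models that share `base` as a common ancestor."""
    if len(finetuned) != len(weights):
        raise ValueError("need exactly one weight per fine-tuned model")
    # Assumption: normalizing means dividing each weight by the sum of weights.
    scale = sum(weights) if normalize else 1.0
    if scale == 0:
        raise ValueError("weights must not sum to zero when normalizing")

    merged: StateDict = {}
    for name, base_values in base.items():
        combined = [0.0] * len(base_values)
        for model, weight in zip(finetuned, weights):
            # Task vector: what fine-tuning changed relative to the base.
            task_vector = [m - b for m, b in zip(model[name], base_values)]
            # Linear combination of the task vectors.
            combined = [
                c + (weight / scale) * t for c, t in zip(combined, task_vector)
            ]
        # Add the combined task vectors back onto the base weights.
        merged[name] = [b + c for b, c in zip(base_values, combined)]
    return merged


base = {"layer.weight": [1.0, 2.0]}
model_a = {"layer.weight": [2.0, 2.0]}  # task vector [1.0, 0.0]
model_b = {"layer.weight": [1.0, 4.0]}  # task vector [0.0, 2.0]

merged = task_arithmetic_merge(base, [model_a, model_b], weights=[1.0, 1.0])
print(merged)  # {'layer.weight': [1.5, 3.0]}
```

With equal weights and normalization, each task vector contributes half of its change, so the merged tensor sits midway between the two fine-tuned models relative to the base.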
|
16 |
+
|
17 |
+
The merge was created using Mergekit, and the parameters can be found below:
|
18 |
+
```yaml
|
19 |
+
models:
|
20 |
+
  - model: deepseek-ai_deepseek-coder-6.7b-instruct
|
21 |
+
    parameters:
|
22 |
+
      weight: 1
|
23 |
+
  - model: ise-uiuc_Magicoder-S-DS-6.7B
|
24 |
+
    parameters:
|
25 |
+
      weight: 1
|
26 |
+
merge_method: task_arithmetic
|
27 |
+
base_model: ise-uiuc_Magicoder-S-DS-6.7B
|
28 |
+
parameters:
|
29 |
+
  normalize: true
|
30 |
+
  int8_mask: true
|
31 |
+
dtype: float16
|
32 |
+
```
|