SanjiWatsuki committed
Commit 2631dac
Parent(s): c443391
Update README.md
README.md
CHANGED
@@ -13,7 +13,7 @@ For 7B models, we can't drop as many of the parameters and retain the model's st
 This is an experiment utilizing two merge techniques together to try to transfer skills between finetuned models. If we were to DARE TIES a low-density merge onto the base Mistral model and then task-arithmetic merge those low-density delta weights onto a finetune, could we still achieve skill transfer?
 
 ```
-models:
+models: # mistral-wizardmath-dare-0.7-density
 - model: mistralai/Mistral-7B-v0.1
   # no parameters necessary for base model
 - model: WizardLM/WizardMath-7B-V1.1
@@ -30,7 +30,7 @@ dtype: bfloat16
 merge_method: task_arithmetic
 base_model: mistralai/Mistral-7B-v0.1
 models:
-- model:
+- model: mistral-wizardmath-dare-0.7-density
 - model: Intel/neural-chat-7b-v3-3
   parameters:
     weight: 1.0
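The diff shows only fragments of the two configs; the first stage's `merge_method`, `parameters`, and `dtype` sit on the elided README lines 20-29. For reference, a complete first-stage config would look roughly like the sketch below. This is a minimal sketch assuming mergekit's `dare_ties` merge method; `density: 0.7` is inferred from the `mistral-wizardmath-dare-0.7-density` name, and `weight: 1.0` is an illustrative assumption rather than the repo's actual value.

```
# Stage 1 (sketch): DARE-TIES the WizardMath deltas onto base Mistral at low
# density, producing the intermediate "mistral-wizardmath-dare-0.7-density".
# density/weight values are assumptions, not the parameters elided from this diff.
models:
- model: mistralai/Mistral-7B-v0.1
  # no parameters necessary for base model
- model: WizardLM/WizardMath-7B-V1.1
  parameters:
    density: 0.7  # keep ~70% of delta parameters, rescale the rest (DARE)
    weight: 1.0   # assumed
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
dtype: bfloat16
```

Each stage can be built with mergekit's CLI (e.g. `mergekit-yaml stage1.yml ./mistral-wizardmath-dare-0.7-density`), and the output directory is then referenced by path as the first `model:` entry in the second-stage config. With unit weights, the task arithmetic stage computes roughly `merged = base + (dare_merge - base) + (neural_chat - base)`, i.e. the sparsified WizardMath deltas are re-applied on top of the Neural-Chat finetune.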