SanjiWatsuki committed
Commit 2631dac
Parent(s): c443391
Update README.md
README.md
CHANGED
@@ -13,7 +13,7 @@ For 7B models, we can't drop as many of the parameters and retain the model's st
 This is an experiment utilizing two merge techniques together to try to transfer skills between finetuned models. If we were to DARE TIES a low-density merge onto the base Mistral model and then task-arithmetic merge those low-density delta weights onto a finetune, could we still achieve skill transfer?
 
 ```
-models:
+models: # mistral-wizardmath-dare-0.7-density
 - model: mistralai/Mistral-7B-v0.1
   # no parameters necessary for base model
 - model: WizardLM/WizardMath-7B-V1.1
@@ -30,7 +30,7 @@ dtype: bfloat16
 merge_method: task_arithmetic
 base_model: mistralai/Mistral-7B-v0.1
 models:
-- model:
+- model: mistral-wizardmath-dare-0.7-density
 - model: Intel/neural-chat-7b-v3-3
   parameters:
     weight: 1.0
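The diff shows only fragments of the two configs; the first stage's `merge_method`, `parameters`, and `dtype` sit on the elided README lines 20-29. For reference, a complete first-stage config would look roughly like the sketch below. This is a minimal sketch assuming mergekit's `dare_ties` merge method; `density: 0.7` is inferred from the `mistral-wizardmath-dare-0.7-density` name, and `weight: 1.0` is an illustrative assumption rather than the repo's actual value.

```
# Stage 1 (sketch): DARE-TIES the WizardMath deltas onto base Mistral at low
# density, producing the intermediate "mistral-wizardmath-dare-0.7-density".
# density/weight values are assumptions, not the parameters elided from this diff.
models:
- model: mistralai/Mistral-7B-v0.1
  # no parameters necessary for base model
- model: WizardLM/WizardMath-7B-V1.1
  parameters:
    density: 0.7  # keep ~70% of delta parameters, rescale the rest (DARE)
    weight: 1.0   # assumed
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
dtype: bfloat16
```

Each stage can be built with mergekit's CLI (e.g. `mergekit-yaml stage1.yml ./mistral-wizardmath-dare-0.7-density`), and the output directory is then referenced by path as the first `model:` entry in the second-stage config. With unit weights, the task arithmetic stage computes roughly `merged = base + (dare_merge - base) + (neural_chat - base)`, i.e. the sparsified WizardMath deltas are re-applied on top of the Neural-Chat finetune.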