---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- WizardLM/WizardCoder-33B-V1.1
- Phind/Phind-CodeLlama-34B-v2
base_model:
- WizardLM/WizardCoder-33B-V1.1
- Phind/Phind-CodeLlama-34B-v2
---
### DISCLAIMER: THIS PROBABLY DOESN'T WORK
# wizardphind-coder-passthrough-39B
wizardphind-coder-passthrough-39B is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
* [WizardLM/WizardCoder-33B-V1.1](https://huggingface.co/WizardLM/WizardCoder-33B-V1.1)
* [Phind/Phind-CodeLlama-34B-v2](https://huggingface.co/Phind/Phind-CodeLlama-34B-v2)
wizardphind-coder-passthrough-39B is an experimental model combining the DeepSeek-based WizardCoder 33B and the CodeLlama-based Phind 34B models.
I expect the model to become much better when trained further on coding-specific tasks.
Since the DeepSeek and CodeLlama model families have differently sized tensors in their MLP/attention layers,
this model will be initialized with empty layers and will need to be fine-tuned further.
This model utilizes all 62 layers of the WizardCoder 33B model and 8 layers from Phind's CodeLlama 34B model.
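
The size mismatch is easy to confirm from the source models' configs. A minimal sketch using `transformers.AutoConfig` (an illustrative check, not part of the merge itself):

```python
from transformers import AutoConfig

# Compare the two source architectures: their hidden and intermediate (MLP)
# sizes differ, which is why a passthrough merge cannot line the tensors up
# directly and the result needs further fine-tuning.
for name in ["WizardLM/WizardCoder-33B-V1.1", "Phind/Phind-CodeLlama-34B-v2"]:
    cfg = AutoConfig.from_pretrained(name)
    print(f"{name}: hidden_size={cfg.hidden_size}, "
          f"intermediate_size={cfg.intermediate_size}, "
          f"num_hidden_layers={cfg.num_hidden_layers}")
```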
## 🧩 Configuration
```yaml
slices:
  - sources:
      - model: WizardLM/WizardCoder-33B-V1.1
        layer_range: [0, 62]
  - sources:
      - model: Phind/Phind-CodeLlama-34B-v2
        layer_range: [24, 32]
merge_method: passthrough
dtype: bfloat16
```
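
Assuming the YAML above is saved as `config.yaml`, the merge can be reproduced with mergekit, either via the `mergekit-yaml` CLI (`mergekit-yaml config.yaml ./merge --cuda`) or its Python entry point. A minimal sketch of the latter; option fields may vary across mergekit versions:

```python
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the passthrough config shown above (assumed saved as config.yaml).
with open("config.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Run the merge; the frankenmerge is written to ./merge.
run_merge(
    merge_config,
    out_path="./merge",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        copy_tokenizer=True,
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```

## 💻 Usage

Once the merge is published, it can be loaded like any other causal LM. The repo id below is a placeholder; and per the disclaimer above, expect poor outputs until the model is fine-tuned:

```python
import torch
import transformers

model_id = "wizardphind-coder-passthrough-39B"  # placeholder; replace with the actual repo id

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype in the merge config
    device_map="auto",
)

prompt = "Write a Python function that checks whether a string is a palindrome."
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```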