
DISCLAIMER: THIS PROBABLY DOESN'T WORK

wizardphind-coder-passthrough-39B

wizardphind-coder-passthrough-39B is a merge of the following models using mergekit:

* WizardLM/WizardCoder-33B-V1.1
* Phind/Phind-CodeLlama-34B-v2

wizardphind-coder-passthrough-39B is an experimental model combining the deepseek-based WizardCoder-33B and the CodeLlama-based Phind-CodeLlama-34B models. I expect the model to improve substantially with further training on coding-specific tasks.

Since deepseek and the codellama models have differently sized tensors for their MLP/attention layers, this model will be initialized with empty layers and will need to be fine-tuned further.
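A rough illustration of the mismatch (the hidden sizes below are assumptions taken from the two published base-model configs, not read from the Hub here): because the residual-stream widths differ, the stacked layers cannot share weights without re-initialization and further fine-tuning.

```python
# Sketch of the dimension clash between the two base architectures.
# Assumption: deepseek-coder-33b uses hidden_size 7168, CodeLlama-34B uses 8192.
configs = {
    "WizardCoder-33B (deepseek-coder-33b base)": {"hidden_size": 7168},
    "Phind-CodeLlama-34B (CodeLlama-34B base)": {"hidden_size": 8192},
}

sizes = {cfg["hidden_size"] for cfg in configs.values()}
# More than one distinct hidden size means the merged stack can't share
# a residual stream as-is, hence the empty layers and need for fine-tuning.
needs_finetune = len(sizes) > 1
print(needs_finetune)  # True
```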

This model utilizes all the layers of the Wizard Coder 33B model and 8 layers from Phind's Codellama 34B model.

🧩 Configuration

```yaml
slices:
  - sources:
      - model: WizardLM/WizardCoder-33B-V1.1
        layer_range: [0, 62]
  - sources:
      - model: Phind/Phind-CodeLlama-34B-v2
        layer_range: [24, 32]
merge_method: passthrough
dtype: bfloat16
```
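As a sanity check on the slice layout, a passthrough merge simply stacks the slices, so the layer counts add up. A quick sketch (assuming mergekit's half-open `[start, end)` convention for `layer_range`):

```python
# Count the decoder layers produced by the passthrough merge above.
# Assumption: layer_range is half-open, i.e. [start, end).
slices = [
    {"model": "WizardLM/WizardCoder-33B-V1.1", "layer_range": (0, 62)},
    {"model": "Phind/Phind-CodeLlama-34B-v2", "layer_range": (24, 32)},
]

def merged_layer_count(slices):
    """Passthrough merging stacks the slices, so layer counts simply add up."""
    return sum(end - start for s in slices for (start, end) in [s["layer_range"]])

print(merged_layer_count(slices))  # 62 + 8 = 70 layers
```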
