---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- WizardLM/WizardCoder-33B-V1.1
- Phind/Phind-CodeLlama-34B-v2
base_model:
- WizardLM/WizardCoder-33B-V1.1
- Phind/Phind-CodeLlama-34B-v2
---

### DISCLAIMER: THIS PROBABLY DOESN'T WORK

# wizardphind-coder-passthrough-39B

wizardphind-coder-passthrough-39B is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
* [WizardLM/WizardCoder-33B-V1.1](https://huggingface.co/WizardLM/WizardCoder-33B-V1.1)
* [Phind/Phind-CodeLlama-34B-v2](https://huggingface.co/Phind/Phind-CodeLlama-34B-v2)

wizardphind-coder-passthrough-39B is an experimental model combining the DeepSeek-based WizardCoder-33B and the CodeLlama-based Phind-34B models. I expect the model to become much better when trained further on coding-specific tasks. Since the DeepSeek and CodeLlama models have differently sized tensors in their MLP/attention layers, this model is initialized with empty layers and will need to be fine-tuned further.

This model uses all of the layers of the WizardCoder-33B model and 8 layers from Phind's CodeLlama-34B model.

## 🧩 Configuration

```yaml
slices:
  - sources:
    - model: WizardLM/WizardCoder-33B-V1.1
      layer_range: [0, 62]
  - sources:
    - model: Phind/Phind-CodeLlama-34B-v2
      layer_range: [24, 32]
merge_method: passthrough
dtype: bfloat16
```
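The configuration above can be run locally to reproduce the merge. Below is a minimal sketch using mergekit's Python entry points (`MergeConfiguration`, `run_merge`, `MergeOptions`); these names and options follow mergekit's documented usage at the time of writing and may differ between versions, and the output path is only a placeholder.

```python
# Minimal sketch: reproduce this passthrough merge with mergekit's Python API.
# Assumes the YAML from the Configuration section has been saved as config.yaml.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load and validate the merge configuration
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./wizardphind-coder-passthrough-39B",  # placeholder output directory
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU if one is available
        copy_tokenizer=True,             # copy the base tokenizer into the output
        lazy_unpickle=True,              # reduce peak memory while loading shards
        low_cpu_memory=False,
    ),
)
```

The resulting checkpoint will still contain the uninitialized layers described above, so it is intended as a starting point for further fine-tuning rather than for direct inference.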