|
--- |
|
license: apache-2.0 |
|
tags: |
|
- merge |
|
- mergekit |
|
- lazymergekit |
|
- WizardLM/WizardCoder-33B-V1.1 |
|
- Phind/Phind-CodeLlama-34B-v2 |
|
--- |
|
|
|
### DISCLAIMER: THIS PROBABLY DOESN'T WORK
|
|
|
# wizardphind-coder-passthrough-39B |
|
|
|
wizardphind-coder-passthrough-39B is a merge of the following models using [mergekit](https://github.com/cg123/mergekit): |
|
* [WizardLM/WizardCoder-33B-V1.1](https://huggingface.co/WizardLM/WizardCoder-33B-V1.1) |
|
* [Phind/Phind-CodeLlama-34B-v2](https://huggingface.co/Phind/Phind-CodeLlama-34B-v2) |
|
|
|
|
|
wizardphind-coder-passthrough-39B is an experimental model combining WizardCoder-33B-V1.1 (a DeepSeek-33B-based model) and Phind-CodeLlama-34B-v2 (a CodeLlama-34B-based model). I expect the model to become much better when trained further on coding-specific tasks.
|
|
|
Since the DeepSeek and CodeLlama architectures have differently sized tensors in their MLP/attention layers, this model is initialized with empty layers and will need to be fine-tuned further before use.
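
As a quick sanity check after merging, here is a minimal sketch that scans the checkpoint for "empty" tensors. It assumes the merged weights were saved locally under `./wizardphind-coder-passthrough-39B` (a hypothetical path) and that empty layers come out zero-initialized; both are assumptions, not guarantees of mergekit's behavior.

```python
# Sanity check: list zero-initialized ("empty") parameter tensors in the
# merged checkpoint. Assumes a hypothetical local path and that empty
# layers are all-zero tensors.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "./wizardphind-coder-passthrough-39B",  # hypothetical local path
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)

empty = [n for n, p in model.named_parameters() if torch.count_nonzero(p) == 0]
print(f"{len(empty)} empty tensors")
for name in empty:
    print(" ", name)
```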
|
|
|
This model utilizes all 62 layers of the WizardCoder-33B model and 8 layers (range [24, 32]) from Phind's CodeLlama-34B model, for 70 layers in total.
|
|
|
|
|
## 🧩 Configuration |
|
|
|
```yaml
slices:
  - sources:
      - model: WizardLM/WizardCoder-33B-V1.1
        layer_range: [0, 62]
  - sources:
      - model: Phind/Phind-CodeLlama-34B-v2
        layer_range: [24, 32]
merge_method: passthrough
dtype: bfloat16
```
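
## 💻 Usage

The merge can be reproduced by saving the config above to a file and running mergekit's `mergekit-yaml` CLI on it. Once merged, and keeping the disclaimer above in mind, the following is a minimal inference sketch with 🤗 Transformers; it assumes the result loads as a standard causal LM, and the model path and prompt are placeholders.

```python
# Minimal inference sketch. The model path is a hypothetical local path;
# device_map="auto" requires the accelerate package.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "./wizardphind-coder-passthrough-39B"  # hypothetical local path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Write a Python function that reverses a linked list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Given the empty MLP/attention layers noted above, outputs are unlikely to be coherent until the model has been fine-tuned.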