
CodeBooga-34B-v0.1

This is a merge between the following two models:

  1. Phind-CodeLlama-34B-v2
  2. WizardCoder-Python-34B-V1.0

It was created with the BlockMerge Gradient script, the same one that was used to create MythoMax-L2-13b, and with the same settings. The following YAML was used:

model_path1: "Phind_Phind-CodeLlama-34B-v2_safetensors"
model_path2: "WizardLM_WizardCoder-Python-34B-V1.0_safetensors"
output_model_path: "CodeBooga-34B-v0.1"
operations:
  - operation: lm_head # Single tensor
    filter: "lm_head"
    gradient_values: [0.75]
  - operation: embed_tokens # Single tensor
    filter: "embed_tokens"
    gradient_values: [0.75]
  - operation: self_attn
    filter: "self_attn"
    gradient_values: [0.75, 0.25]
  - operation: mlp
    filter: "mlp"
    gradient_values: [0.25, 0.75]
  - operation: layernorm
    filter: "layernorm"
    gradient_values: [0.5, 0.5]
  - operation: modelnorm # Single tensor
    filter: "model.norm"
    gradient_values: [0.75]
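For intuition, here is a minimal Python sketch of what such a gradient merge does (illustrative only, not the actual BlockMerge Gradient code): each gradient_values list is read as a blend ratio that is linearly interpolated across the matching tensors, and that ratio is assumed here to be the weight given to model_path1, with the remainder going to model_path2.

import numpy as np
import torch

def ramp(gradient_values, num_tensors):
    # Linearly interpolate the listed blend ratios over the sequence of matching tensors.
    if len(gradient_values) == 1:
        return [float(gradient_values[0])] * num_tensors
    xs = np.linspace(0, len(gradient_values) - 1, num_tensors)
    return list(np.interp(xs, range(len(gradient_values)), gradient_values))

def blend(tensor1, tensor2, w):
    # Weighted average: w goes to model 1, (1 - w) goes to model 2.
    return w * tensor1 + (1.0 - w) * tensor2

# Example: a 0.75 -> 0.25 ramp over the self_attn tensors of a 48-layer model,
# so (under the weighting assumption above) early layers lean towards model 1
# and late layers towards model 2.
weights = ramp([0.75, 0.25], 48)
merged_first = blend(torch.ones(4), torch.zeros(4), weights[0])   # mostly model 1
merged_last = blend(torch.ones(4), torch.zeros(4), weights[-1])   # mostly model 2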

Prompt format

Both base models use the Alpaca format, so it should be used for this one as well.

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Your instruction

### Response:
Bot reply

### Instruction:
Another instruction

### Response:
Bot reply
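If you build prompts programmatically, a small helper like the one below keeps the format consistent (an illustrative sketch; the function name and structure are not part of any official tooling):

def build_alpaca_prompt(instruction, history=None):
    # history is a list of (instruction, response) pairs from earlier turns.
    prompt = ("Below is an instruction that describes a task. "
              "Write a response that appropriately completes the request.\n\n")
    for past_instruction, past_response in (history or []):
        prompt += f"### Instruction:\n{past_instruction}\n\n### Response:\n{past_response}\n\n"
    prompt += f"### Instruction:\n{instruction}\n\n### Response:\n"
    return prompt

print(build_alpaca_prompt("Write a Python function that reverses a string."))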

Evaluation

(This is not very scientific, so bear with me.)

I ran a quick experiment where I asked a set of 3 Python and 3 JavaScript questions (real-world, difficult questions with nuance) to the following models:

  1. This one
  2. A second variant generated with model_path1 and model_path2 swapped in the YAML above, which I called CodeBooga-Reversed-34B-v0.1
  3. WizardCoder-Python-34B-V1.0
  4. Phind-CodeLlama-34B-v2

Specifically, I used 4.250 bpw EXL2 quantizations of each. I then sorted the responses for each question by quality and assigned the following scores:

  • 4th place: 0
  • 3rd place: 1
  • 2nd place: 2
  • 1st place: 4

The resulting cumulative scores were:

  • CodeBooga-34B-v0.1: 22
  • WizardCoder-Python-34B-V1.0: 12
  • Phind-CodeLlama-34B-v2: 7
  • CodeBooga-Reversed-34B-v0.1: 1
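For reference, the tally behind these totals is simple enough to express in a few lines of Python (the per-question rankings themselves are not reproduced here):

PLACE_POINTS = {1: 4, 2: 2, 3: 1, 4: 0}  # 1st place earns 4 points, 4th earns 0

def cumulative_scores(per_question_rankings):
    # per_question_rankings: one list per question, best-ranked model first.
    totals = {}
    for ordering in per_question_rankings:
        for place, model in enumerate(ordering, start=1):
            totals[model] = totals.get(model, 0) + PLACE_POINTS[place]
    return totals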

CodeBooga-34B-v0.1 performed very well, while its variant performed poorly, so I uploaded the former but not the latter.

Quantized versions

GGUF

TheBloke has kindly provided GGUF quantizations for llama.cpp:

https://huggingface.co/TheBloke/CodeBooga-34B-v0.1-GGUF
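As a rough starting point, the GGUF files can be loaded with llama-cpp-python; the file name, context size, and generation settings below are illustrative assumptions, not recommendations from the model author:

from llama_cpp import Llama

llm = Llama(
    model_path="codebooga-34b-v0.1.Q4_K_M.gguf",  # any of the GGUF files from the repo above
    n_ctx=4096,         # context window
    n_gpu_layers=-1,    # offload all layers to the GPU if llama.cpp was built with GPU support
)

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a Python function that checks whether a string is a palindrome.\n\n"
    "### Response:\n"
)

output = llm(prompt, max_tokens=256, stop=["### Instruction:"])
print(output["choices"][0]["text"])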
