Commit 2d46a68 by laurentiubp
Parent(s): 9df5920
Update README.md

README.md CHANGED
```diff
@@ -1,14 +1,12 @@
 ---
 license: llama3
-base_model:
-- catallama/CataLlama-v0.2-Instruct-SFT
-- catallama/CataLlama-v0.2-Instruct-DPO
+base_model: catallama/CataLlama-v0.2-Instruct-SFT-DPO-Merged
 tags:
 - llama
 - llama-3
 - catalan
 model-index:
-- name: CataLlama-v0.2-Instruct-SFT-DPO-Merged
+- name: CataLlama-v0.2-Instruct-SFT-DPO-Merged-GGUF
 results: []
 datasets:
 - catallama/Catalan-DPO-V2
@@ -23,9 +21,7 @@ pipeline_tag: text-generation
 
 # CataLlama-v0.2-Instruct-SFT-DPO-Merged-GGUF
 
-**CataLlama-v0.2-Instruct-SFT-DPO-Merged** is a
-
-The resulting model scores better than it's parents on both MMLU and GSM8K.
+**CataLlama-v0.2-Instruct-SFT-DPO-Merged-GGUF** is a quantisation of [catallama/CataLlama-v0.2-Instruct-SFT-DPO-Merged](https://huggingface.co/catallama/CataLlama-v0.2-Instruct-SFT-DPO-Merged)
 
 **This is an instruction fine-tuned model, optimised with DPO, proficient on the following tasks in Catalan**
 
@@ -43,7 +39,7 @@ The resulting model scores better than it's parents on both MMLU and GSM8K.
 **License** The model uses the llama-3 license available at: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
 
 
-## Benchmarks
+## Benchmarks (for the bf16 model)
 
 | Model              | CataLlama-v0.2-Instruct-DPO | CataLlama-v0.2-Instruct-SFT | CataLlama-v0.2-Instruct-SFT-DPO-Merged |
 | ------------------ | --------------------------- | --------------------------- | -------------------------------------- |
@@ -51,14 +47,6 @@ The resulting model scores better than it's parents on both MMLU and GSM8K.
 | GSM8K CoT 8 shot   | 60.05                       | 76.04                       | **77.26**                              |
 
 
-## Merging procedure
-
-The merge was performed between the 32 layers of the two models, excluding the embedding, norm and the head layers.
-
-The weights of the 32 layers were merged in equal proportion simply by calculating the average of the corresponding weights from the parent models.
-
-The embedding, norm and head layers are copied from CataLlama-v0.2-Instruct-DPO without modification.
-
 **This was done with a custom script, without mergekit.**
 
 
```
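The "Merging procedure" section removed by this commit describes a simple equal-proportion merge: the weights of the 32 transformer layers are averaged across the two parents, while the embedding, norm and head layers are copied from the DPO parent unchanged. A minimal sketch of that idea follows; the function name, parameter names, and prefix list are hypothetical, and plain Python lists stand in for the real checkpoint tensors:

```python
# Sketch of an equal-proportion weight merge: layer weights are averaged
# 50/50 between two parent models, while parameters whose names match the
# pass-through prefixes (embedding, norm, head) are copied from the DPO
# parent without modification. Dicts of name -> list of floats stand in
# for real state dicts of tensors.

def merge_states(sft_state, dpo_state, passthrough_prefixes=("embed", "norm", "head")):
    """Average corresponding weights; copy excluded layers from the DPO parent."""
    merged = {}
    for name, dpo_weights in dpo_state.items():
        if name.startswith(passthrough_prefixes):
            # Taken verbatim from the DPO parent.
            merged[name] = list(dpo_weights)
        else:
            # Element-wise average of the two parents' weights.
            sft_weights = sft_state[name]
            merged[name] = [(a + b) / 2 for a, b in zip(sft_weights, dpo_weights)]
    return merged

sft = {"layers.0.mlp": [1.0, 2.0], "embed.tokens": [9.0, 9.0]}
dpo = {"layers.0.mlp": [3.0, 4.0], "embed.tokens": [5.0, 6.0]}
merged = merge_states(sft, dpo)
print(merged["layers.0.mlp"])   # averaged: [2.0, 3.0]
print(merged["embed.tokens"])   # copied from DPO parent: [5.0, 6.0]
```

With real checkpoints the same loop would run over `torch` state dicts, replacing the list comprehension with `(a + b) / 2` on tensors; the structure of the procedure is otherwise identical.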