Commit 2d46a68 by laurentiubp
Parent(s): 9df5920
Update README.md

README.md CHANGED
```diff
@@ -1,14 +1,12 @@
 ---
 license: llama3
-base_model:
-- catallama/CataLlama-v0.2-Instruct-SFT
-- catallama/CataLlama-v0.2-Instruct-DPO
+base_model: catallama/CataLlama-v0.2-Instruct-SFT-DPO-Merged
 tags:
 - llama
 - llama-3
 - catalan
 model-index:
-- name: CataLlama-v0.2-Instruct-SFT-DPO-Merged
+- name: CataLlama-v0.2-Instruct-SFT-DPO-Merged-GGUF
 results: []
 datasets:
 - catallama/Catalan-DPO-V2
@@ -23,9 +21,7 @@ pipeline_tag: text-generation
 
 # CataLlama-v0.2-Instruct-SFT-DPO-Merged-GGUF
 
-**CataLlama-v0.2-Instruct-SFT-DPO-Merged** is a
-
-The resulting model scores better than it's parents on both MMLU and GSM8K.
+**CataLlama-v0.2-Instruct-SFT-DPO-Merged-GGUF** is a quantisation of [catallama/CataLlama-v0.2-Instruct-SFT-DPO-Merged](https://huggingface.co/catallama/CataLlama-v0.2-Instruct-SFT-DPO-Merged)
 
 **This is an instruction fine-tuned model, optimised with DPO, proficient on the following tasks in Catalan**
 
@@ -43,7 +39,7 @@ The resulting model scores better than it's parents on both MMLU and GSM8K.
 **License** The model uses the llama-3 license available at: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
 
 
-## Benchmarks
+## Benchmarks (for the bf16 model)
 
 | Model              | CataLlama-v0.2-Instruct-DPO | CataLlama-v0.2-Instruct-SFT | CataLlama-v0.2-Instruct-SFT-DPO-Merged |
 | ------------------ | --------------------------- | --------------------------- | -------------------------------------- |
@@ -51,14 +47,6 @@ The resulting model scores better than it's parents on both MMLU and GSM8K.
 | GSM8K CoT 8 shot   | 60.05                       | 76.04                       | **77.26**                              |
 
 
-## Merging procedure
-
-The merge was performed between the 32 layers of the two models, excluding the embedding, norm and the head layers.
-
-The weights of the 32 layers were merged in equal proportion simply by calculating the average of the corresponding weights from the parent models.
-
-The embedding, norm and head layers are copied from CataLlama-v0.2-Instruct-DPO without modification.
-
 **This was done with a custom script, without mergekit.**
 
 
```
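The "Merging procedure" section removed by this commit describes a simple equal-proportion merge: the weights of the 32 transformer layers are averaged across the two parents, while the embedding, norm and head layers are copied from the DPO parent unchanged. A minimal sketch of that idea follows; the function name, parameter names, and prefix list are hypothetical, and plain Python lists stand in for the real checkpoint tensors:

```python
# Sketch of an equal-proportion weight merge: layer weights are averaged
# 50/50 between two parent models, while parameters whose names match the
# pass-through prefixes (embedding, norm, head) are copied from the DPO
# parent without modification. Dicts of name -> list of floats stand in
# for real state dicts of tensors.

def merge_states(sft_state, dpo_state, passthrough_prefixes=("embed", "norm", "head")):
    """Average corresponding weights; copy excluded layers from the DPO parent."""
    merged = {}
    for name, dpo_weights in dpo_state.items():
        if name.startswith(passthrough_prefixes):
            # Taken verbatim from the DPO parent.
            merged[name] = list(dpo_weights)
        else:
            # Element-wise average of the two parents' weights.
            sft_weights = sft_state[name]
            merged[name] = [(a + b) / 2 for a, b in zip(sft_weights, dpo_weights)]
    return merged

sft = {"layers.0.mlp": [1.0, 2.0], "embed.tokens": [9.0, 9.0]}
dpo = {"layers.0.mlp": [3.0, 4.0], "embed.tokens": [5.0, 6.0]}
merged = merge_states(sft, dpo)
print(merged["layers.0.mlp"])   # averaged: [2.0, 3.0]
print(merged["embed.tokens"])   # copied from DPO parent: [5.0, 6.0]
```

With real checkpoints the same loop would run over `torch` state dicts, replacing the list comprehension with `(a + b) / 2` on tensors; the structure of the procedure is otherwise identical.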