---
license: llama3
---
# Llama3-Prime
This [Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)-based model is a merge of several pretrained Llama 3 models that were optimized for user preference. As a result, the merged model should be strong at providing relevant answers to user queries; usability matters more here than topping benchmarks.

- Input: text only
- Output: text only
- Prompt format: Llama 3
- Language: English
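
The Llama 3 Instruct format wraps each turn in special header tokens. In practice the tokenizer's `apply_chat_template` handles this automatically, but as a rough illustration, a single-turn prompt can be assembled like this:

```python
def llama3_prompt(user_message: str,
                  system_message: str = "You are a helpful assistant.") -> str:
    """Assemble a single-turn prompt in the Llama 3 Instruct format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_message}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        # The prompt ends with an open assistant header so the model
        # generates the assistant's reply next.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(llama3_prompt("What is a model merge?"))
```

The model's generation then continues from the trailing assistant header and is expected to terminate with its own `<|eot_id|>` token.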

This model was created by merging multiple models with equal weights using [MergeKit's](https://github.com/arcee-ai/mergekit) `model_stock` method.

Base Model: [Daredevil-8B](https://huggingface.co/mlabonne/Daredevil-8B)

Models Used:
- [Llama-3-Instruct-8B-SimPO-ExPO](https://huggingface.co/chujiezheng/Llama-3-Instruct-8B-SimPO-ExPO)
- [Llama-3-8B-Magpie-Pro-SFT-v0.1](https://huggingface.co/Magpie-Align/Llama-3-8B-Magpie-Pro-SFT-v0.1)
- [SELM-Llama-3-8B-Instruct-iter-3](https://huggingface.co/ZhangShenao/SELM-Llama-3-8B-Instruct-iter-3)
- [LLaMA3-iterative-DPO-final-ExPO](https://huggingface.co/chujiezheng/LLaMA3-iterative-DPO-final-ExPO)
- [Llama-3-Instruct-8B-SPPO-Iter3](https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3)
- [MAmmoTH2-8B-Plus](https://huggingface.co/TIGER-Lab/MAmmoTH2-8B-Plus)
- [Bagel-8b-v1.0](https://huggingface.co/jondurbin/bagel-8b-v1.0)

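The exact merge configuration was not published, but a MergeKit `model_stock` config for the models above would look roughly like this (a sketch, not the configuration actually used; `dtype` is an assumption):

```yaml
# Hypothetical MergeKit config approximating the merge described above.
# model_stock averages the listed models relative to the base model.
merge_method: model_stock
base_model: mlabonne/Daredevil-8B
models:
  - model: chujiezheng/Llama-3-Instruct-8B-SimPO-ExPO
  - model: Magpie-Align/Llama-3-8B-Magpie-Pro-SFT-v0.1
  - model: ZhangShenao/SELM-Llama-3-8B-Instruct-iter-3
  - model: chujiezheng/LLaMA3-iterative-DPO-final-ExPO
  - model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3
  - model: TIGER-Lab/MAmmoTH2-8B-Plus
  - model: jondurbin/bagel-8b-v1.0
dtype: bfloat16
```

A config like this is typically run with MergeKit's `mergekit-yaml` command, which writes the merged weights to an output directory.
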
Training Details:

The merged model was then trained with [LLaMA Factory](https://github.com/hiyouga/LLaMA-Factory) on the `alpaca_en_demo` dataset to ensure the model responds in the Llama 3 Instruct format. Training used a LoRA rank of 1, an alpha of 1, and a dropout rate of 0.3 — deliberately weak training, so as not to interfere with the merged model's capabilities.
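
The training config was not published; a LLaMA Factory LoRA config reflecting the stated parameters might look roughly like this (paths and the unlisted settings are placeholders):

```yaml
# Hypothetical LLaMA Factory config; only lora_rank, lora_alpha,
# lora_dropout, and the dataset are stated in this card.
model_name_or_path: ./merged-model   # placeholder path to the merged weights
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 1        # very low rank: minimal adaptation
lora_alpha: 1
lora_dropout: 0.3
dataset: alpaca_en_demo
template: llama3    # keeps responses in the Llama 3 Instruct format
output_dir: ./llama3-prime
```

With rank and alpha both set to 1, the LoRA update is tiny, which matches the stated goal of nudging the model toward the Llama 3 Instruct format without disturbing the merged weights.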