Text Generation
Transformers
PyTorch
Safetensors
English
Chinese
llama
axolotl
Generated from Trainer
conversational
text-generation-inference
Inference Endpoints
flydust committed
Commit 93ec240 · verified · 1 Parent(s): b4cb452

Update README.md

Files changed (1)
  1. README.md +105 -66
README.md CHANGED
@@ -5,12 +5,112 @@ tags:
  - axolotl
  - generated_from_trainer
  model-index:
- - name: Llama-3-8B-Magpie-Mix-RC
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

  [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
  <details><summary>See axolotl config</summary>
@@ -37,7 +137,7 @@ datasets:
  conversation: llama3
  dataset_prepared_path: last_run_prepared
  val_set_size: 0.001
- output_dir: /home/cc/axolotl/axolotl_out/Llama-3-8B-base-magpie-RC

  sequence_len: 8192
  sample_packing: true
@@ -47,7 +147,7 @@ pad_to_sequence_len: true
  wandb_project: SynDa
  wandb_entity:
  wandb_watch:
- wandb_name: Llama-3-8B-base-150KR-Llama3-Pro-MT-300K-C
  wandb_log_model:
  hub_model_id: Magpie-Align/Llama-3-8B-Magpie-Mix-RC

@@ -88,64 +188,3 @@ special_tokens:
  ```

  </details><br>
-
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/uw-nsl/SynDa/runs/tw9z4syg)
- # Llama-3-8B-Magpie-Mix-RC
-
- This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.4611
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 4
- - gradient_accumulation_steps: 32
- - total_train_batch_size: 128
- - total_eval_batch_size: 4
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 98
- - num_epochs: 2
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 0.8616 | 0.0019 | 1 | 0.8870 |
- | 0.5554 | 0.2013 | 106 | 0.5568 |
- | 0.5067 | 0.4027 | 212 | 0.5065 |
- | 0.4728 | 0.6040 | 318 | 0.4865 |
- | 0.4681 | 0.8054 | 424 | 0.4740 |
- | 0.4563 | 1.0067 | 530 | 0.4662 |
- | 0.4115 | 1.1944 | 636 | 0.4642 |
- | 0.3993 | 1.3957 | 742 | 0.4620 |
- | 0.4048 | 1.5971 | 848 | 0.4613 |
- | 0.4167 | 1.7984 | 954 | 0.4611 |
-
-
- ### Framework versions
-
- - Transformers 4.42.3
- - Pytorch 2.3.1+cu121
- - Datasets 2.19.1
- - Tokenizers 0.19.1

  - axolotl
  - generated_from_trainer
  model-index:
+ - name: Llama-3-8B-Magpie-Align-SFT-v1.0
  results: []
+ datasets:
+ - Magpie-Align/Magpie-Reasoning-150K
+ - Magpie-Align/Magpie-Pro-MT-300K-v0.1
+ - Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese
+ language:
+ - en
+ - zh
  ---

+ ![Magpie](https://cdn-uploads.huggingface.co/production/uploads/653df1323479e9ebbe3eb6cc/FWWILXrAGNwWr52aghV0S.png)
+
+ # 🐦 Llama-3-8B-Magpie-Align-SFT-v1.0
+
+ Project Web: [https://magpie-align.github.io/](https://magpie-align.github.io/)
+
+ Arxiv Technical Report: [https://arxiv.org/abs/2406.08464](https://arxiv.org/abs/2406.08464)
+
+ Code: [https://github.com/magpie-align/magpie](https://github.com/magpie-align/magpie)
+
+ ## 🧐 About This Model
+
+ This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on:
+ - [Magpie-Align/Magpie-Pro-MT-300K-v0.1](https://huggingface.co/datasets/Magpie-Align/Magpie-Pro-MT-300K-v0.1),
+ - [Magpie-Align/Magpie-Reasoning-150K](https://huggingface.co/datasets/Magpie-Align/Magpie-Reasoning-150K), and
+ - [Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese](https://huggingface.co/datasets/Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese)
+
+ Compared to [v0.2](https://huggingface.co/Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v0.2), this version enhances multilingual ability by incorporating a new dataset of 200K Chinese instructions. It achieves performance comparable to the official Llama-3-8B-Instruct model **with SFT only**! The detailed benchmark performance is as follows:
+
+ - **MT-Bench: 8.050 (First Turn), 7.350 (Second Turn), 7.700 (Average)**
+ - **Alpaca Eval 2 (GPT-4-Turbo-1106): 26.37 (LC), 26.42 (WR)**
+ - **Alpaca Eval 2 (Llama-3-8B-Instruct): 54.53 (LC), 55.26 (WR)**
+ - **Arena Hard: 20.6**
+
+ ## 👀 Other Information
+
+ **License**: Please follow the [Meta Llama 3 Community License](https://llama.meta.com/llama3/license).
+
+ **Conversation Template**: Please use the Llama 3 **official chat template** for the best performance.
+
+ **How to use it?** Please check the official [Llama 3 repository](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct#how-to-use) for detailed instructions. Simply replace the original `model_id` with `Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v1.0`.
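For reference, a minimal usage sketch (an editor's illustration, not part of the committed README). It assumes the standard `transformers` chat-template API and that the tokenizer ships the Llama 3 chat template; the generation settings mirror the official Llama 3 Instruct example and are only suggestions.

```
# Minimal sketch: chat with the model using the Llama 3 template.
# Assumes transformers >= 4.42, torch, and enough GPU memory for bf16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build the prompt with the official Llama 3 chat template, as recommended above.
messages = [{"role": "user", "content": "Give me three fun facts about magpies."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Stop on either the regular EOS token or Llama 3's end-of-turn token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```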
+
+ ## 📚 Citation
+
+ If you find the model, data, or code useful, please cite our paper:
+ ```
+ @article{xu2024magpie,
+   title={Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing},
+   author={Zhangchen Xu and Fengqing Jiang and Luyao Niu and Yuntian Deng and Radha Poovendran and Yejin Choi and Bill Yuchen Lin},
+   year={2024},
+   eprint={2406.08464},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```
+ **Questions?** Please contact [Zhangchen](https://zhangchenxu.com/) by email.
+
+ ## Paper Abstract
+ <details><summary>Click Here</summary>
+ High-quality instruction data is critical for aligning large language models (LLMs). Although some models, such as Llama-3-Instruct, have open weights, their alignment data remain private, which hinders the democratization of AI. High human labor costs and a limited, predefined scope for prompting prevent existing open-source data creation methods from scaling effectively, potentially limiting the diversity and quality of public alignment datasets. Is it possible to synthesize high-quality instruction data at scale by extracting it directly from an aligned LLM? We present a self-synthesis method for generating large-scale alignment data named Magpie. Our key observation is that aligned LLMs like Llama-3-Instruct can generate a user query when we input only the left-side templates up to the position reserved for user messages, thanks to their auto-regressive nature. We use this method to prompt Llama-3-Instruct and generate 4 million instructions along with their corresponding responses. We perform a comprehensive analysis of the extracted data and select 300K high-quality instances. To compare Magpie data with other public instruction datasets, we fine-tune Llama-3-8B-Base with each dataset and evaluate the performance of the fine-tuned models. Our results indicate that in some tasks, models fine-tuned with Magpie perform comparably to the official Llama-3-8B-Instruct, despite the latter being enhanced with 10 million data points through supervised fine-tuning (SFT) and subsequent feedback learning. We also show that using Magpie solely for SFT can surpass the performance of previous public datasets utilized for both SFT and preference optimization, such as direct preference optimization with UltraFeedback. This advantage is evident on alignment benchmarks such as AlpacaEval, ArenaHard, and WildBench.
+ </details><br>
+
+ ## 🏃‍♂️ Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 4
+ - gradient_accumulation_steps: 32
+ - total_train_batch_size: 128
+ - total_eval_batch_size: 4
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 98
+ - num_epochs: 2
+
90
+ ### Training results
91
+
92
+ | Training Loss | Epoch | Step | Validation Loss |
93
+ |:-------------:|:------:|:----:|:---------------:|
94
+ | 0.8616 | 0.0019 | 1 | 0.8870 |
95
+ | 0.5554 | 0.2013 | 106 | 0.5568 |
96
+ | 0.5067 | 0.4027 | 212 | 0.5065 |
97
+ | 0.4728 | 0.6040 | 318 | 0.4865 |
98
+ | 0.4681 | 0.8054 | 424 | 0.4740 |
99
+ | 0.4563 | 1.0067 | 530 | 0.4662 |
100
+ | 0.4115 | 1.1944 | 636 | 0.4642 |
101
+ | 0.3993 | 1.3957 | 742 | 0.4620 |
102
+ | 0.4048 | 1.5971 | 848 | 0.4613 |
103
+ | 0.4167 | 1.7984 | 954 | 0.4611 |
104
+
105
+
106
+ ### Framework versions
107
+
108
+ - Transformers 4.42.3
109
+ - Pytorch 2.3.1+cu121
110
+ - Datasets 2.19.1
111
+ - Tokenizers 0.19.1
112
+
113
+ *Internal name for identification: Llama-3-8B-Magpie-Mix-RC*. Please change the model name in the below Axolotl config.
114
 
115
  [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
116
  <details><summary>See axolotl config</summary>
 
  conversation: llama3
  dataset_prepared_path: last_run_prepared
  val_set_size: 0.001
+ output_dir: axolotl_out/Llama-3-8B-Magpie-Mix-RC

  sequence_len: 8192
  sample_packing: true

  wandb_project: SynDa
  wandb_entity:
  wandb_watch:
+ wandb_name: Llama-3-8B-Magpie-Mix-RC
  wandb_log_model:
  hub_model_id: Magpie-Align/Llama-3-8B-Magpie-Mix-RC

  ```

  </details><br>