---
license: llama2
---

# MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models

This repository contains the models used in the [paper](https://arxiv.org/abs/2405.13053) "MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models".

The corresponding GitHub repository is [MeteoRA](https://github.com/ParagonLight/meteor-of-lora).

![MeteoRA framework](images/framework.png)

## Overall performance

### General performance of MeteoRA-embedded LLMs with 28 LoRA adapters

We successfully apply MeteoRA to both LLaMA2-13B and LLaMA3-8B, equipping each model with 28 LoRA adapters, one per task.
The performance of MeteoRA is comparable to the state of the art. Refer to our paper for details of the evaluation settings.

<!-- Evaluation results of models based on LLaMA2-13B:
![Evaluation Results](images/llama2_13b_radar_graph_v3.png)

Evaluation results of models based on LLaMA3-8B:
![Evaluation Results](images/llama3_8b_radar_graph_v3.png) -->

<table>
<tr>
<td><img src="images/llama2_13b_radar_graph_v3.png" alt="LLaMA2-13B" width="300"/></td>
<td><img src="images/llama3_8b_radar_graph_v3.png" alt="LLaMA3-8B" width="300"/></td>
</tr>
</table>
&nbsp;&nbsp;&nbsp;&nbsp;MeteoRA with LLaMA2-13B &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MeteoRA with LLaMA3-8B

### Example of *composite-3* tasks

We highlight the statistically dominant LoRA selected by MeteoRA at the token level (decoded to words). The result shows that an LLM with MeteoRA achieves timely LoRA switching during both input understanding and output generation. The background color gets darker where the Gating network assigns a higher weight.

![Composite-3 example](images/serial_3_short.png)
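
To make the switching mechanism concrete, here is a minimal sketch of a per-token top-k Gating network over LoRA adapters with a softmax temperature. The class name, shapes, and default hyperparameters are illustrative assumptions on our part; the repository's actual implementation lives in `MoELoRA/layer.py`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatingSketch(nn.Module):
    """Illustrative per-token top-k gate over LoRA adapters (not the repo's exact code)."""

    def __init__(self, hidden_dim: int, num_adapters: int = 28, top_k: int = 2, T: float = 1.0):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, num_adapters)  # one logit per adapter
        self.top_k = top_k
        self.T = T  # softmax temperature; larger T flattens the distribution

    def forward(self, h: torch.Tensor):
        # h: (batch, seq_len, hidden_dim) -> per-token adapter weights and indices
        probs = F.softmax(self.proj(h) / self.T, dim=-1)
        weights, idx = torch.topk(probs, self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize top-k mass
        return weights, idx  # darker highlight above <=> larger weight on that LoRA
```

Because the gate is applied per token, the dominant adapter can change mid-sequence, which is exactly what the highlighted example above visualizes.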

## Directory structure

- `llama3_8b_lora_b`: Contains one LoRA adapter fine-tuned on all 28 tasks together in balanced-dataset mode (1,000 samples per task).
- `llama3_8b_lora_f`: Contains one LoRA adapter fine-tuned on all 28 tasks together in full-dataset mode.
- `llama3_8b_meteora`: Contains the LLaMA3-8B base model equipped with MeteoRA. Both the top-1 and top-2 versions are included.
- `llama3_8b_peft`: Contains 28 LoRA adapters, one fine-tuned for each of the 28 tasks.
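
For instance, any single adapter from `llama3_8b_peft` should be loadable with the standard HuggingFace PEFT API. This is a hypothetical usage sketch, with placeholder paths, rather than a command from the repository:

```python
# Attach one of the 28 task-specific LoRA adapters to the base model.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = PeftModel.from_pretrained(base, "llama3_8b_peft/<adapter_name>")  # placeholder path
```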

## Usage

### Preparation

0. Clone the GitHub repository [MeteoRA](https://github.com/ParagonLight/meteor-of-lora).

1. Install the necessary packages:
```shell
pip install -r requirements.txt
```

2. Prepare the datasets. MeteoRA requires datasets in JSONL format. In the paper, the tasks are primarily selected from the BIGBench benchmark, whose data is in JSON format. To convert them to JSONL format (a sketch of the conversion follows this step), run:
```shell
cd data
python create_dataset.py --task all
```

To create a specific dataset, use:
```shell
cd data
python create_dataset.py --task <task_name>
```
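
For reference, the conversion amounts to writing one example per line. This is a minimal sketch that assumes the JSON file holds a top-level list of examples; `data/create_dataset.py` remains the authoritative implementation:

```python
import json

def json_to_jsonl(src: str, dst: str) -> None:
    """Write each example from a JSON list as one JSON object per line."""
    with open(src) as f:
        examples = json.load(f)  # assumed layout: a top-level list of examples
    with open(dst, "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

json_to_jsonl("task.json", "task.jsonl")  # file names are placeholders
```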

3. Prepare *composite-n* tasks. Refer to our paper for the definition of *composite-n* tasks. Generate these tasks using:
```shell
python create_composite.py --n <n>
```
We provide few-shot dataset generation code for `n=3`, `n=5`, and `n=10`. Before generating, ensure that the sub-tasks composing the *composite-n* task are already included in `data/datasets` (a rough sketch of the assembly idea follows this step).
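
The rough idea of the assembly is sketched below under assumed file names and field names (`input`/`target` are our guesses); `create_composite.py` is the authoritative implementation:

```python
import json
import random

def make_composite(task_files: list[str]) -> dict:
    """Chain one randomly drawn example from each sub-task into one composite example."""
    parts = []
    for path in task_files:
        with open(path) as f:
            parts.append(json.loads(random.choice(f.readlines())))
    return {
        "input": "\n\n".join(p["input"] for p in parts),    # sub-task prompts in order
        "target": "\n\n".join(p["target"] for p in parts),  # answers expected in the same order
    }

# e.g. a composite-3 example drawn from three sub-task datasets
composite = make_composite(["strategyqa.jsonl", "disfl_qa.jsonl", "alpaca.jsonl"])
```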

4. Prepare LoRA adapters and MeteoRA model checkpoints. You can train them yourself or download our pre-trained models ([MeteoRA with LLaMA2](https://huggingface.co/ParagonLight/MeteoRA-llama2-13b) and [MeteoRA with LLaMA3](https://huggingface.co/ParagonLight/MeteoRA-llama3-8b) as the base model):
```shell
python download_ckpt.py
```

5. Update the file paths in `configs/config.yaml`. Example paths:
```yaml
base_model_path: 'meta-llama3/Meta-Llama-3-8B'
meteora_ckpt_path: 'ckpt/llama3_8b/llama3_8b_meteora/top_2'
adapter_dir: 'ckpt/llama3_8b/llama3_8b_peft'
```
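
As a rough illustration of how these paths might be consumed (the variable names are assumptions; see `eval_model.py` in the GitHub repository for the actual loading code):

```python
import yaml
from transformers import AutoModelForCausalLM, AutoTokenizer

with open("configs/config.yaml") as f:
    cfg = yaml.safe_load(f)

tokenizer = AutoTokenizer.from_pretrained(cfg["base_model_path"])
base_model = AutoModelForCausalLM.from_pretrained(cfg["base_model_path"])
# cfg["meteora_ckpt_path"] and cfg["adapter_dir"] would then be used to load
# the MeteoRA gating checkpoint and the 28 LoRA adapters on top of the base model.
```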

### Evaluation

Run a benchmark with the MeteoRA model:
```shell
python eval_model.py --task <task_name> --batch_size <batch_size>
```

For example:
```shell
python eval_model.py --task composite_10 --batch_size 4
```

**Note:** For *composite-n* tasks, set a larger *temperature* value (`self.T` in `MoELoRA/layer.py`): use `15`, `20`, and `30` for `n=3`, `n=5`, and `n=10`, respectively. For single tasks, use the default value (`self.T = 1`).
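
A quick numeric illustration of why a larger temperature helps on *composite-n* inputs (the logit values here are made up; the real gate lives in `MoELoRA/layer.py`):

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([4.0, 3.0, 1.0])   # gate logits for three LoRAs
print(F.softmax(logits / 1.0, dim=-1))   # T=1  -> sharp: ~[0.71, 0.26, 0.04]
print(F.softmax(logits / 15.0, dim=-1))  # T=15 -> flat:  ~[0.36, 0.34, 0.30]
# A flatter distribution keeps several LoRAs in play, making it easier to
# switch adapters across the sub-tasks of a composite-n input.
```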

To save the evaluation result:
```shell
python eval_model.py --task <task_name> --batch_size <batch_size> --save
```

For debug mode (model output and ground truth are shown in the console):
```shell
python eval_model.py --task <task_name> --batch_size <batch_size> --debug
```

Run a benchmark with a PEFT model (a single LoRA adapter):
```shell
python eval_model.py --task <task_name> --batch_size <batch_size> --model <adapter_name>
```

### Training the MeteoRA Model

0. Prepare LoRA adapters and corresponding datasets in JSONL format. Ensure each LoRA adapter has a corresponding dataset, and place all LoRA adapters and datasets in their respective folders with matching subfolder names (see the name check sketched after this step):
```
- lora_adapters
    - adapter_name1
    - adapter_name2
    - ...
- datasets
    - dataset_name1
    - dataset_name2
    - ...
```
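
As a hypothetical sanity check (not part of the repository's scripts; folder paths follow the layout above), you can verify that the two folders have matching subfolder names:

```python
from pathlib import Path

adapters = {p.name for p in Path("lora_adapters").iterdir() if p.is_dir()}
datasets = {p.name for p in Path("datasets").iterdir() if p.is_dir()}
unmatched = adapters ^ datasets  # symmetric difference: names without a partner
if unmatched:
    raise SystemExit(f"Unmatched adapter/dataset folders: {sorted(unmatched)}")
```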

1. Update the file paths in `run_meteora_train_fsdp.sh`.

2. Train the MeteoRA model:
```shell
sh run_meteora_train_fsdp.sh
```

**Note:** The current version of the Triton acceleration supports inference mode only. Use the following settings when training the MeteoRA model:

```shell
export MOELINEAR_USE_ACCELERATE_FWD=0
export MOELINEAR_FWD_INNER_LOOP_MODE='batch'
export MOELINEAR_ACCELERATE_FWD_BACKEND='torch'
export MOELINEAR_ACCELERATE_FWD_BACKEND_TORCH_VERSION='v1'
```

### Evaluation Results

#### *composite-n* results

The *composite-10* evaluation results are presented in detail, with MeteoRA results on the left and LoRA-B results on the right of each metric column. A dash ('-') indicates that the corresponding metric was not applicable or not included in the evaluation. Note that the `0.00` BLEU scores are caused by mismatched or insufficient answers.

| Sub-task Name | Accuracy↑ (MeteoRA) | Accuracy↑ (LoRA-B) | BLEU↑ (MeteoRA) | BLEU↑ (LoRA-B) | ROUGE-1↑ (MeteoRA) | ROUGE-1↑ (LoRA-B) | ROUGE-2↑ (MeteoRA) | ROUGE-2↑ (LoRA-B) | ROUGE-L↑ (MeteoRA) | ROUGE-L↑ (LoRA-B) |
|----------------------------------|---------------------|--------------------|-----------------|----------------|--------------------|-------------------|--------------------|-------------------|--------------------|-------------------|
| logical_deduction | 0.500↑ | 0.453 | - | - | - | - | - | - | - | - |
| question_selection | 0.703↑ | 0.688 | - | - | - | - | - | - | - | - |
| abstract_narrative_understanding | 0.625↓ | 0.672 | - | - | - | - | - | - | - | - |
| goal_step_wikihow | 0.773↑ | 0.727 | - | - | - | - | - | - | - | - |
| winowhy | 0.422↑ | 0.078 | - | - | - | - | - | - | - | - |
| strategyqa | 0.461↑ | 0.211 | 3.23↑ | 0.00 | 0.225↑ | 0.106 | 0.051↑ | 0.025 | 0.210↑ | 0.099 |
| disfl_qa | 0.266↑ | 0.117 | - | - | - | - | - | - | - | - |
| news_commentary_de | - | - | 14.78↑ | 14.54 | - | - | - | - | - | - |
| alpaca | - | - | 0.00↓ | 8.17 | 0.257↑ | 0.187 | 0.075 | 0.075 | 0.241↑ | 0.167 |
| linguistics_puzzles | - | - | 17.37↑ | 12.14 | 0.233↑ | 0.189 | 0.052↑ | 0.030 | 0.176↑ | 0.103 |
168
+
169
+
170
+ ## Citation
171
+
172
+ If you use MeteoRA for your research, please cite our [paper](https://arxiv.org/abs/2405.13053):
173
+ ```bibtex
174
+ @misc{xu2024meteora,
175
+ title={MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models},
176
+ author={Jingwei Xu and Junyu Lai and Yunpeng Huang},
177
+ year={2024},
178
+ eprint={2405.13053},
179
+ archivePrefix={arXiv},
180
+ }
181
+ ```