mikecovlee committed
Commit: 674303e
Parent(s): f13fd28

Update README.md

Files changed (1): README.md (+84, -11)
README.md CHANGED
 
@@ -7,7 +7,7 @@ datasets:
 
 <div align="left"><img src="https://huggingface.co/scu-kdde/alpaca-mixlora-7b/resolve/main/MixLoRA.png" width="60%"></div>
 
- GitHub: https://github.com/TUDB-Labs/multi-lora-fine-tune
+ GitHub: https://github.com/mikecovlee/mlora
 
  The fundamental concept of MixLoRA is based on a pre-trained model with all parameters frozen, such as LLaMA-7B. It involves training multiple LoRA expert modules on top of its fully connected layer (FFN). Simultaneously, a routing layer (Gate Linear) is trained, creating a more powerful Mixture of Experts (MoE) language model. Theoretically, this approach allows achieving performance similar to existing MoE models but with fewer resources.
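The paragraph above only describes the architecture in prose. As a rough illustration (a sketch, not code from the m-LoRA repository), the block below wires the pieces together in PyTorch: a frozen FFN, a trainable gate over `num_experts` LoRA experts, and a weighted mixture of the expert outputs. The class and argument names are made up for this example, the gated LLaMA FFN is simplified to a single linear layer, and a dense soft mixture is used instead of sparse dispatch; the routing strategies configured later in this README make the mixture sparse.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRAExpert(nn.Module):
    """One LoRA expert: a low-rank delta trained on top of the frozen FFN."""
    def __init__(self, hidden: int, ffn_hidden: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.scaling = alpha / r
        self.lora_a = nn.Linear(hidden, r, bias=False)
        self.lora_b = nn.Linear(r, ffn_hidden, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # each expert starts as a no-op

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.lora_b(self.lora_a(x)) * self.scaling

class MixLoRABlock(nn.Module):
    """Frozen FFN + trainable gate (routing layer) + several LoRA experts."""
    def __init__(self, hidden: int, ffn_hidden: int, num_experts: int = 8):
        super().__init__()
        self.ffn = nn.Linear(hidden, ffn_hidden)  # stands in for the frozen pre-trained FFN
        self.ffn.requires_grad_(False)            # base model parameters stay frozen
        self.gate = nn.Linear(hidden, num_experts, bias=False)  # trainable router
        self.experts = nn.ModuleList(
            [LoRAExpert(hidden, ffn_hidden) for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, hidden); dense soft mixture over all experts for clarity
        weights = F.softmax(self.gate(x), dim=-1)                      # (tokens, num_experts)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (tokens, E, ffn_hidden)
        return self.ffn(x) + (weights.unsqueeze(-1) * expert_out).sum(dim=1)

# Example with LLaMA-7B-like sizes: 4 tokens in, 4 FFN activations out
block = MixLoRABlock(hidden=4096, ffn_hidden=11008)
print(block(torch.randn(4, 4096)).shape)  # torch.Size([4, 11008])
```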
 
@@ -15,15 +15,74 @@ In addition, MixLoRA also allows simultaneous fine-tuning of the attention layer
 
 MixLoRA exists within m-LoRA in a specific adapter form. Consequently, m-LoRA is capable of simultaneously loading, training, and fine-tuning multiple distinct MixLoRA models. However, it's essential to note that these models must be based on the same pre-trained model.
 
+ ## Configuration of MixLoRA
+
+ Compared with LoRA, MixLoRA has some additional configuration items.
+ ```json
+ {
+     "name": "lora_0",
+     "optim": "adamw",
+     "lr": 1e-5,
+     "batch_size": 16,
+     "micro_batch_size": 2,
+     "test_batch_size": 64,
+     "num_epochs": 3,
+     "r": 8,
+     "lora_alpha": 16,
+     "lora_dropout": 0.05,
+     "target_modules": {
+         "q_proj": true,
+         "k_proj": false,
+         "v_proj": true,
+         "o_proj": false,
+         "w1_proj": false,
+         "w2_proj": false,
+         "w3_proj": false
+     },
+     "data": "yahma/alpaca-cleaned",
+     "prompt": "template/alpaca.json",
+     "group_by_length": false,
+     "expand_side": "right"
+ }
+ ```
+ This is an example of a plain LoRA training configuration; you can find detailed instructions in [README.md](./README.md).
+
+ MixLoRA has two routing strategies: top-k routing (as in *Mixtral*) and top-1 switch routing (as in *Switch Transformers*), which can be selected with `"routing_strategy": "mixtral"` or `"routing_strategy": "switch"`.
+
+ **Top-k Routing**
+ ```json
+ {
+     ...
+     "routing_strategy": "mixtral",
+     "num_experts": 8,
+     "act_fn": "silu",
+     "top_k": 2,
+     ...
+ }
+ ```
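To make the `num_experts` / `top_k` items above concrete, here is a short sketch of Mixtral-style top-k routing; the function name and the renormalization step are illustrative assumptions rather than m-LoRA's actual implementation.

```python
import torch
import torch.nn.functional as F

def top_k_route(gate_logits: torch.Tensor, top_k: int = 2) -> torch.Tensor:
    """Keep the k best experts per token, renormalize their weights, zero out the rest."""
    weights = F.softmax(gate_logits, dim=-1)           # (tokens, num_experts)
    top_w, top_i = torch.topk(weights, top_k, dim=-1)  # k highest gate scores per token
    top_w = top_w / top_w.sum(dim=-1, keepdim=True)    # renormalize over the chosen k
    return torch.zeros_like(weights).scatter(-1, top_i, top_w)

# Example: 4 tokens routed over 8 experts; each row has exactly two non-zero weights
print(top_k_route(torch.randn(4, 8), top_k=2))
```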
+
+ **Top-1 Switch Routing**
+ ```json
+ {
+     ...
+     "routing_strategy": "switch",
+     "num_experts": 8,
+     "act_fn": "gelu_new",
+     "expert_capacity": 32,
+     "jitter_noise": 0.1,
+     "ffn_dropout": 0.1,
+     ...
+ }
+ ```
+ You can add these items into training configurations to enable the MixLoRA architecture.
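The following sketch shows how `expert_capacity` and `jitter_noise` could enter a Switch-Transformers-style top-1 router: logits are jittered during training, each token goes to its single best expert, and tokens beyond an expert's capacity are dropped by the router. The function and its exact overflow handling are assumptions for illustration, not m-LoRA's implementation.

```python
import torch
import torch.nn.functional as F

def switch_route(gate_logits: torch.Tensor, expert_capacity: int = 32,
                 jitter_noise: float = 0.1, training: bool = True):
    """Top-1 routing sketch: jitter the logits during training, send each token to
    its single best expert, and drop tokens that overflow an expert's capacity."""
    if training and jitter_noise > 0:
        noise = torch.empty_like(gate_logits).uniform_(1.0 - jitter_noise, 1.0 + jitter_noise)
        gate_logits = gate_logits * noise                 # multiplicative jitter
    probs = F.softmax(gate_logits, dim=-1)                # (tokens, num_experts)
    gate_w, expert_idx = probs.max(dim=-1)                # best expert per token
    one_hot = F.one_hot(expert_idx, probs.size(-1))       # (tokens, num_experts)
    position = (one_hot.cumsum(dim=0) * one_hot).sum(-1)  # arrival order within each expert
    kept = position <= expert_capacity                    # overflowing tokens are dropped
    return expert_idx, gate_w * kept                      # dropped tokens keep only the frozen FFN path

# Example: 6 tokens over 8 experts with a deliberately tiny capacity
idx, w = switch_route(torch.randn(6, 8), expert_capacity=1)
print(idx, w)
```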
 ## Create MixLoRA model
 
 Basic command for creating a baseline model on the [Alpaca Cleaned](https://github.com/gururise/AlpacaDataCleaned) dataset:
 ```bash
- python mlora.py \
+ CUDA_VISIBLE_DEVICES=0 python mlora.py \
 --base_model yahma/llama-7b-hf \
 --config ./config/alpaca_mixlora.json \
- --load_8bit \
- --mixlora
+ --load_8bit
 ```
 Please note that once the MixLoRA model is created, the number of experts in the model cannot be changed.
 
@@ -32,25 +91,39 @@ Please note that once the MixLoRA model is created, the number of experts in the
 The MixLoRA model can also undergo further fine-tuning.
 Basic command for finetuning a model on the [Alpaca Cleaned](https://github.com/gururise/AlpacaDataCleaned) dataset:
 ```bash
- python mlora.py \
+ CUDA_VISIBLE_DEVICES=0 python mlora.py \
 --base_model yahma/llama-7b-hf \
 --config ./config/alpaca_mixlora.json \
- --load_adapter \
 --load_8bit \
- --mixlora
+ --load_adapter
 ```
 
 ## Evaluate MixLoRA model
 
 Currently, MixLoRA supports evaluation only through the m-LoRA framework.
 ```bash
- python mlora.py \
+ CUDA_VISIBLE_DEVICES=0 python mlora.py \
 --base_model yahma/llama-7b-hf \
 --config ./config/alpaca_mixlora.json \
- --load_adapter \
 --load_8bit \
- --inference \
- --mixlora
+ --inference
+ ```
+ This approach allows running inference with multiple MixLoRA and LoRA adapters simultaneously. We also provide a WebUI and an example script for inference.
+ ```bash
+ # Run the inference WebUI
+ CUDA_VISIBLE_DEVICES=0 python inference.py \
+ --base_model yahma/llama-7b-hf \
+ --lora_weights scu-kdde/alpaca-mixlora-7b \
+ --template template/alpaca.json \
+ --load_8bit
+
+ # Generate a single response
+ CUDA_VISIBLE_DEVICES=0 python generate.py \
+ --base_model yahma/llama-7b-hf \
+ --lora_weights scu-kdde/alpaca-mixlora-7b \
+ --template template/alpaca.json \
+ --load_8bit \
+ --instruction "What is m-LoRA?"
 ```
 
  ## Citation