mikecovlee committed
Commit 7221067
1 Parent(s): 674303e

Update README.md

Files changed (1):
  1. README.md (+9, -2)
README.md CHANGED
@@ -15,6 +15,13 @@ In addition, MixLoRA also allows simultaneous fine-tuning of the attention layer
 
 MixLoRA exists within m-LoRA in a specific adapter form. Consequently, m-LoRA is capable of simultaneously loading, training, and fine-tuning multiple distinct MixLoRA models. However, it's essential to note that these models must be based on the same pre-trained model.
 
+## MMLU Scores
+
+|Model|Configuration|MMLU Average|STEM|Social Sciences|Humanities|Other|
+|-----------------|---------------------------------|--------|--------|--------|--------|--------|
+|Alpaca-LoRA-7B   |LoRA Rank = 16, QKVO             | 24.2   | 24.1   |**25.0**| 25.2   | 22.7   |
+|Alpaca-MixLoRA-7B|LoRA Rank = 8, Top-2 of 8 Experts|**25.5**|**26.1**| 23.3   |**25.3**|**26.9**|
+
 ## Configuration of MixLoRA
 
 Compared with LoRA, MixLoRA have some additional configurations.
@@ -132,14 +139,14 @@ Please cite the repo if you use the code in this repo.
 @misc{alpaca-mixlora-7b,
 author = {Dengchun, Li and Tingfeng, Lan and Zhengmao, Ye and Lei, Duan and Mingjie, Tang},
 title = {MixLoRA MoE model based on AlpacaCleaned dataset and LLaMA-7B base model},
-year = {2023},
+year = {2024},
 publisher = {HuggingFace Hub},
 howpublished = {\url{https://huggingface.co/scu-kdde/alpaca-mixlora-7b}},
 }
 ```
 
 ## Copyright
-Copyright © 2023 All Rights Reserved.
+Copyright © 2023-2024 All Rights Reserved.
 
 This project is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
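
For readers who land on this commit without the rest of the README, the "Top-2 of 8 Experts" configuration in the newly added MMLU table refers to mixture-of-experts style routing: a gate scores all experts for each token and only the two highest-scoring experts are applied. Below is a minimal, hypothetical PyTorch sketch of that routing idea; the variable names, shapes, and plain `Linear` stand-in experts are illustrative assumptions, not MixLoRA's or m-LoRA's actual implementation.

```python
import torch

num_experts, top_k, hidden = 8, 2, 16                       # "Top-2 of 8 Experts" from the table
gate = torch.nn.Linear(hidden, num_experts, bias=False)      # router that scores the experts
experts = torch.nn.ModuleList(
    torch.nn.Linear(hidden, hidden) for _ in range(num_experts)
)  # stand-ins for the experts (in MixLoRA these would be LoRA-based experts)

x = torch.randn(4, hidden)                       # a small batch of token states
scores = gate(x)                                 # [4, num_experts] router logits
weights, picked = torch.topk(scores, top_k, -1)  # keep the two best experts per token
weights = torch.softmax(weights, dim=-1)         # normalize over the selected experts

out = torch.zeros_like(x)
for slot in range(top_k):
    for e in range(num_experts):
        mask = picked[:, slot] == e              # tokens routed to expert e in this slot
        if mask.any():
            out[mask] += weights[mask, slot].unsqueeze(-1) * experts[e](x[mask])
```

Only the two selected experts run for each token, so the per-token cost grows with `top_k` rather than with the total number of experts.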