mikecovlee committed
Commit c28e12d • 1 Parent(s): 345be52
Update README.md
README.md CHANGED

@@ -5,7 +5,7 @@ datasets:
 ---
 # MixLoRA: Resource-Efficient Model with Mix-of-Experts Architecture for Enhanced LoRA Performance
 
-<div align="left"><img src="
+<div align="left"><img src="MixLoRA.png" width="60%"></div>
 
 Large Language Models (LLMs) have showcased exceptional performance across a wide array of Natural Language Processing (NLP) tasks. Fine-tuning techniques are commonly utilized to tailor pre-trained models to specific applications. While methods like LoRA have effectively tackled GPU memory constraints during fine-tuning, their applicability is often restricted. On the other hand, Mix-of-Experts (MoE) models, such as Mixtral 8x7B, demonstrate remarkable performance while maintaining a reduced parameter count. However, the resource requirements of these models pose challenges, particularly for consumer-grade GPUs.
 
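The abstract above motivates combining low-rank LoRA adapters with a Mixture-of-Experts style router on top of a frozen base model. The sketch below is a rough illustration only, not the MixLoRA implementation: the class names, rank, expert count, and top-2 routing are assumptions chosen for brevity.

```python
import torch
import torch.nn as nn


class LoRAExpert(nn.Module):
    """One low-rank adapter: adds scale * B(A(x)) on top of a frozen base output."""

    def __init__(self, dim: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)  # A: project down to the low rank
        self.up = nn.Linear(rank, dim, bias=False)    # B: project back up
        nn.init.zeros_(self.up.weight)                # start as a zero update
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x)) * self.scale


class MoELoRALayer(nn.Module):
    """Token-level top-k routing over LoRA experts sharing one frozen linear layer (illustrative)."""

    def __init__(self, base: nn.Linear, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.base = base.requires_grad_(False)  # base weights stay frozen
        self.experts = nn.ModuleList(LoRAExpert(base.in_features) for _ in range(num_experts))
        self.router = nn.Linear(base.in_features, num_experts, bias=False)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)
        # Route each token to its top-k experts and add their weighted LoRA updates.
        weights, idx = self.router(x).softmax(dim=-1).topk(self.top_k, dim=-1)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e).unsqueeze(-1)
                out = out + mask * weights[..., k : k + 1] * expert(x)
        return out


# Usage: wrap a frozen 64x64 linear layer and pass a batch of token embeddings through it.
layer = MoELoRALayer(nn.Linear(64, 64), num_experts=4, top_k=2)
print(layer(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```

A production router would typically renormalize the selected top-k weights and add an auxiliary load-balancing loss; both are omitted here to keep the sketch short.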