Tags: Text Generation · Transformers · Inference Endpoints
mikecovlee committed on
Commit c28e12d
1 Parent(s): 345be52

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -5,7 +5,7 @@ datasets:
 ---
 # MixLoRA: Resource-Efficient Model with Mix-of-Experts Architecture for Enhanced LoRA Performance
 
-<div align="left"><img src="./assets/MixLoRA.png" width=60%"></div>
+<div align="left"><img src="MixLoRA.png" width=60%"></div>
 
 Large Language Models (LLMs) have showcased exceptional performance across a wide array of Natural Language Processing (NLP) tasks. Fine-tuning techniques are commonly utilized to tailor pre-trained models to specific applications. While methods like LoRA have effectively tackled GPU memory constraints during fine-tuning, their applicability is often restricted. On the other hand, Mix-of-Expert (MoE) models, such as Mixtral 8x7B, demonstrate remarkable performance while maintaining a reduced parameter count. However, the resource requirements of these models pose challenges, particularly for consumer-grade GPUs.
 
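To make the idea named in the README's title concrete, below is a minimal, illustrative PyTorch sketch of a MixLoRA-style layer: a frozen base linear projection whose output is augmented by a small pool of LoRA experts chosen per token by a top-k router. The class and parameter names (`MixLoRALinear`, `LoRAExpert`, `num_experts`, `top_k`, `rank`) are hypothetical; this is a sketch of the general MoE-of-LoRA pattern, not the authors' implementation.

```python
# Illustrative sketch only: a frozen dense projection plus a router that
# mixes a small set of LoRA experts. Names and hyperparameters are
# hypothetical, not taken from the MixLoRA codebase.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRAExpert(nn.Module):
    """One low-rank adapter: output = B(A(x)) * scaling."""

    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.lora_a = nn.Linear(in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # zero-init B so the adapter starts as a no-op
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.lora_b(self.lora_a(x)) * self.scaling


class MixLoRALinear(nn.Module):
    """Frozen base linear layer whose output is corrected by top-k routed LoRA experts."""

    def __init__(self, base: nn.Linear, num_experts: int = 8, top_k: int = 2, rank: int = 8):
        super().__init__()
        self.base = base
        self.base.requires_grad_(False)  # only the router and the adapters are trained
        self.router = nn.Linear(base.in_features, num_experts, bias=False)
        self.experts = nn.ModuleList(
            LoRAExpert(base.in_features, base.out_features, rank) for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)
        gate = F.softmax(self.router(x), dim=-1)                # (..., num_experts)
        weights, indices = torch.topk(gate, self.top_k, dim=-1)  # keep only the top-k experts
        weights = weights / weights.sum(dim=-1, keepdim=True)    # renormalize their weights
        # Simple (unoptimized) dispatch: add each selected expert's contribution per token.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (indices[..., slot] == e).unsqueeze(-1)   # tokens routed to expert e
                out = out + mask * weights[..., slot : slot + 1] * expert(x)
        return out


if __name__ == "__main__":
    layer = MixLoRALinear(nn.Linear(64, 64))
    print(layer(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```

Under these assumptions, only the router and the low-rank adapters receive gradients, which is consistent with the README's framing: expert specialization from an MoE-style design while the fine-tuning memory footprint stays close to that of plain LoRA.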