yixinsong committed on
Commit
e1381e5
1 Parent(s): c6ddad5

Update README.md

Files changed (1):
  1. README.md +29 -3
README.md CHANGED
@@ -1,3 +1,29 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+ # Model Card for TurboSparse-Mistral
+ The TurboSparse-Mistral Large Language Model (LLM) is a sparsified version of Mistral.
+
+ <img src="takeaway.png" alt="avatar" width="300" height="200"/>
+
+ The average performance is evaluated using benchmarks from the OpenLLM Leaderboard.
+
+ ## Inference
+
+ Our code for accelerating TurboSparse-Mistral is still being refined. Stay tuned! For now, you can run this model like a dense model.
+
+ ## Chat Template
+
+ During sparsification, we also utilize some SFT datasets.
+ We use ChatML as our chat template:
+ ```
+ <|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n
+ ```
+
+ ## Finetuning
+
+ Because we merged the predictors for FFN neurons into the model, you can finetune TurboSparse-Mistral with any framework and algorithm.
+
+ ## License
+
+ The model is licensed under Apache-2.0. The model weights are fully open for academic research and also allow **free** commercial use.