LLMNewbie committed
Commit 8999e6a
Parent(s): 37bf98d
Update README.md

README.md CHANGED
@@ -5,4 +5,14 @@ language:
- zh
---

+This model is a weight-pruned large language model derived from Vicuna-13B.
+Language model pruning is a technique used to reduce the size and computational requirements of language models,
+making them more efficient to deploy without significantly sacrificing their performance or accuracy.
+
+This model uses structured pruning rather than unstructured pruning.
+Structured pruning removes entire units or channels (e.g., neurons, layers, or filter channels in a transformer).
+This approach tends to deliver real computational gains, since it aligns better with how hardware processes data,
+but it may have a more significant impact on model performance.
+Unstructured pruning, by contrast, removes individual weights across the model without regard to the structure of the network.
+While it can lead to significant reductions in model size,
+it may not always translate into speed gains, since the resulting sparse matrices might not be handled efficiently by all hardware.
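
To make the structured/unstructured distinction in the README concrete, here is a minimal sketch using PyTorch's `torch.nn.utils.prune` utilities on a toy `nn.Linear` layer. This is an illustration only, not the procedure used to produce this model; the layer sizes and pruning amounts are arbitrary assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
structured = nn.Linear(16, 8)    # toy layer for structured pruning
unstructured = nn.Linear(16, 8)  # toy layer for unstructured pruning

# Structured: zero the 25% of output neurons (whole rows of `weight`)
# with the smallest L2 norm. Fully zeroed rows can later be removed,
# shrinking the matrix in a hardware-friendly way.
prune.ln_structured(structured, name="weight", amount=0.25, n=2, dim=0)

# Unstructured: zero the 50% of individual weights with the smallest
# absolute value, anywhere in the matrix, ignoring its structure.
prune.l1_unstructured(unstructured, name="weight", amount=0.5)

# Both calls attach a binary mask; fold it into the tensors permanently.
prune.remove(structured, "weight")
prune.remove(unstructured, "weight")

zero_rows = int((structured.weight.abs().sum(dim=1) == 0).sum())
sparsity = float((unstructured.weight == 0).float().mean())
print(f"structured: {zero_rows}/8 output rows fully zeroed")
print(f"unstructured: {sparsity:.0%} of individual weights zeroed")
```

Note how the structured variant leaves entire rows empty, so the layer can be rebuilt with a genuinely smaller weight matrix; the unstructured variant only yields a sparse matrix of the same shape, which is why, as the README says, it does not always translate into speed gains.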