LLMNewbie committed
Commit 8999e6a
Parent(s): 37bf98d
Update README.md

README.md CHANGED
@@ -5,4 +5,14 @@ language:
- zh
---

+This model is a weight-pruned large language model derived from Vicuna-13B.
+Language model pruning is a technique used to reduce the size and computational requirements of language models,
+making them more efficient to deploy without significantly sacrificing their performance or accuracy.
+
+This model uses structured pruning rather than unstructured pruning.
+Structured pruning removes entire units or channels (e.g., neurons, layers, or filter channels in a transformer).
+This approach tends to deliver real computational gains, since it aligns better with how hardware processes data,
+but it may have a more significant impact on model performance.
+Unstructured pruning, by contrast, removes individual weights across the model without regard to the structure of the network.
+While it can lead to significant reductions in model size,
+it may not always translate into speed gains, since the resulting sparse matrices might not be handled efficiently by all hardware.
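
To make the structured/unstructured distinction in the README concrete, here is a minimal sketch using PyTorch's `torch.nn.utils.prune` utilities on a toy `nn.Linear` layer. This is an illustration only, not the procedure used to produce this model; the layer sizes and pruning amounts are arbitrary assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
structured = nn.Linear(16, 8)    # toy layer for structured pruning
unstructured = nn.Linear(16, 8)  # toy layer for unstructured pruning

# Structured: zero the 25% of output neurons (whole rows of `weight`)
# with the smallest L2 norm. Fully zeroed rows can later be removed,
# shrinking the matrix in a hardware-friendly way.
prune.ln_structured(structured, name="weight", amount=0.25, n=2, dim=0)

# Unstructured: zero the 50% of individual weights with the smallest
# absolute value, anywhere in the matrix, ignoring its structure.
prune.l1_unstructured(unstructured, name="weight", amount=0.5)

# Both calls attach a binary mask; fold it into the tensors permanently.
prune.remove(structured, "weight")
prune.remove(unstructured, "weight")

zero_rows = int((structured.weight.abs().sum(dim=1) == 0).sum())
sparsity = float((unstructured.weight == 0).float().mean())
print(f"structured: {zero_rows}/8 output rows fully zeroed")
print(f"unstructured: {sparsity:.0%} of individual weights zeroed")
```

Note how the structured variant leaves entire rows empty, so the layer can be rebuilt with a genuinely smaller weight matrix; the unstructured variant only yields a sparse matrix of the same shape, which is why, as the README says, it does not always translate into speed gains.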