---
license: llama2
language:
- en
---
# ProSparse-LLaMA-2-7B-GGUF
- Original model: [SparseLLM/ProSparse-LLaMA-2-7B](https://huggingface.co/SparseLLM/prosparse-llama-2-7b)
- Converted & distributed by: [THUNLP](https://nlp.csai.tsinghua.edu.cn/), [ModelBest](https://modelbest.cn), and [PowerInfer](https://huggingface.co/PowerInfer)

This repository is a downstream distribution of [SparseLLM/ProSparse-LLaMA-2-7B](https://huggingface.co/SparseLLM/prosparse-llama-2-7b) in PowerInfer GGUF format. It contains both the LLM weights and the activation predictor weights.

Note: `prosparse-llama-2-7b-clip15.gguf` is a variant GGUF file with the same model weights but different activation predictors, which are trained on data that retains only the top 15% of activation values. Compared with `prosparse-llama-2-7b.gguf`, this variant has higher predicted sparsity and faster inference, but suffers from relatively lower activation recall.
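
The trade-off in the note above can be illustrated with a small, self-contained sketch. It is not the authors' training pipeline: the function names (`top_fraction_labels`, `predicted_sparsity`, `activation_recall`), the toy activation values, and the assumption that a neuron counts as "truly active" when its activation is greater than zero are all hypothetical choices made for illustration. It compares a predictor that fires on every nonzero neuron with one that fires only on the top 15% of activation values.

```python
def top_fraction_labels(activations, keep_fraction=0.15):
    """Mark the largest `keep_fraction` of activation values as active.

    Hypothetical stand-in for "data only reserving top 15% activation
    values"; the real predictor training data may be built differently.
    """
    if not activations:
        return []
    k = max(1, round(len(activations) * keep_fraction))
    # Rank neuron indices by activation value, largest first.
    ranked = sorted(range(len(activations)),
                    key=lambda i: activations[i], reverse=True)
    keep = set(ranked[:k])
    return [i in keep for i in range(len(activations))]


def predicted_sparsity(predicted_active):
    """Fraction of neurons the predictor marks as inactive (skippable)."""
    return 1.0 - sum(predicted_active) / len(predicted_active)


def activation_recall(activations, predicted_active):
    """Fraction of truly active neurons (assumed: activation > 0)
    that the predictor also marks as active."""
    truly_active = [a > 0 for a in activations]
    n_active = sum(truly_active)
    if n_active == 0:
        return 1.0  # nothing to recall
    hits = sum(1 for t, p in zip(truly_active, predicted_active) if t and p)
    return hits / n_active


if __name__ == "__main__":
    # Toy post-ReLU activations for 20 neurons: 14 zeros, 6 nonzero.
    acts = [0.0] * 14 + [0.1, 0.2, 0.3, 0.9, 1.5, 2.0]

    full = [a > 0 for a in acts]            # fires on every nonzero neuron
    clipped = top_fraction_labels(acts)     # fires on the top 15% only

    for name, pred in (("full", full), ("clip15", clipped)):
        print(f"{name}: sparsity={predicted_sparsity(pred):.2f} "
              f"recall={activation_recall(acts, pred):.2f}")
```

In this toy setting the clipped predictor skips more neurons (higher predicted sparsity) but misses the small nonzero activations (lower recall), which mirrors the qualitative behavior described for the `clip15` variant.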
### Citation
Please kindly cite using the following BibTeX:
```bibtex
@article{song2024prosparse,
title={{ProSparse}: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models},
author={Song, Chenyang and Han, Xu and Zhang, Zhengyan and Hu, Shengding and Shi, Xiyu and Li, Kuai and Chen, Chen and Liu, Zhiyuan and Li, Guangli and Yang, Tao and Sun, Maosong},
year={2024},
journal={arXiv preprint arXiv:2402.13516},
url={https://arxiv.org/pdf/2402.13516.pdf}
}
```