---
license: apache-2.0
---

**Paper**: [https://arxiv.org/pdf/2310.06694.pdf](https://arxiv.org/pdf/2310.06694.pdf)
**Code**: [https://github.com/princeton-nlp/LLM-Shearing](https://github.com/princeton-nlp/LLM-Shearing)
**Models**: [Sheared-LLaMA-1.3B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B), [Sheared-LLaMA-2.7B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-2.7B)

## Training information

This is the instruction-tuned version of [princeton-nlp/Sheared-LLaMA-2.7B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-2.7B). We trained the base model on 10,000 instruction-response pairs sampled from the ShareGPT dataset (first turns only). We use the following prompt for instruction tuning:

> You are a helpful assistant. Write a response that appropriately completes the request.\n\n### Input:\n{input}\n\n### Response:
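
For concreteness, here is a minimal sketch of how this template can be applied to a single first-turn instruction (the variable name and example input below are illustrative, not taken from our training code):

```
# Prompt template from this card; {input} is replaced by the user instruction.
PROMPT = (
    "You are a helpful assistant. Write a response that appropriately "
    "completes the request.\n\n### Input:\n{input}\n\n### Response:"
)

# Hypothetical ShareGPT first-turn instruction.
example = PROMPT.format(input="Summarize what structured pruning does.")
print(example)
```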

This model can be loaded through `transformers.LlamaForCausalLM` as follows:

```
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("princeton-nlp/Sheared-LLaMA-2.7B-ShareGPT")
```
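
A short usage sketch that combines loading with the prompt template above; the generation settings here are illustrative defaults, not the ones used in our experiments:

```
from transformers import AutoTokenizer, LlamaForCausalLM

model_id = "princeton-nlp/Sheared-LLaMA-2.7B-ShareGPT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id)

# Fill the instruction-tuning template with a user request.
prompt = (
    "You are a helpful assistant. Write a response that appropriately "
    "completes the request.\n\n### Input:\nWhat is structured pruning?\n\n### Response:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)  # illustrative length cap
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```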

## BibTeX

If you find our model useful, consider citing our paper:

```
@article{xia2023sheared,
  title={Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning},
  author={Xia, Mengzhou and Gao, Tianyu and Zeng, Zhiyuan and Chen, Danqi},
  journal={arXiv preprint arXiv:2310.06694},
  year={2023}
}
```