Paper: https://arxiv.org/pdf/2310.06694.pdf
Code: https://github.com/princeton-nlp/LLM-Shearing

License: This model is derived from Pythia and must therefore comply with the Pythia license.

Sheared-Pythia-160m is a model pruned and further pre-trained from EleutherAI/pythia-410m. During both pruning and continued pre-training, we dynamically load data from the different domains of the Pile dataset. We use 0.4B tokens for pruning and 50B tokens for continued pre-training of the pruned model. The model can be loaded with HuggingFace Transformers via

from transformers import GPTNeoXForCausalLM
model = GPTNeoXForCausalLM.from_pretrained("princeton-nlp/Sheared-Pythia-160m")
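
As a minimal usage sketch (the prompt and generation settings below are illustrative, and the standard transformers AutoTokenizer API is assumed), the loaded model can be used for text generation like this:

from transformers import AutoTokenizer, GPTNeoXForCausalLM

# Load the tokenizer and pruned model from the Hub
tokenizer = AutoTokenizer.from_pretrained("princeton-nlp/Sheared-Pythia-160m")
model = GPTNeoXForCausalLM.from_pretrained("princeton-nlp/Sheared-Pythia-160m")

# Encode an example prompt and generate a short continuation
inputs = tokenizer("The Pile is a large, diverse", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))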

The model's overall downstream performance is better than that of EleutherAI/pythia-160m.

BibTeX

@article{xia2023sheared,
  title={Sheared LLaMA: Accelerating language model pre-training via structured pruning},
  author={Xia, Mengzhou and Gao, Tianyu and Zeng, Zhiyuan and Chen, Danqi},
  journal={arXiv preprint arXiv:2310.06694},
  year={2023}
}