---
license: mit
---
# SLAB-Llama-350M

An unofficial reproduction of the PRepBN-Llama-350M checkpoint from [SLAB](https://github.com/xinghaochen/SLAB/).
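The sketch below shows how the checkpoint could be loaded for generation. It is a minimal sketch only: the repo id `OpenEfficientAI/SLAB-Llama-350M` is assumed from this model page, and it further assumes the uploaded weights load through the standard Hugging Face `transformers` Auto classes; if the PRepBN modules require custom code, use the model definition from the official SLAB repository instead.

```python
# Minimal generation sketch. Assumptions: the Hub repo id below matches this model card,
# and the checkpoint is loadable via the standard transformers Auto classes; otherwise
# load it with the model code from https://github.com/xinghaochen/SLAB/tree/main/llama.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "OpenEfficientAI/SLAB-Llama-350M"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Efficient transformers are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```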
### Model Sources

- **Repository:** [https://github.com/xinghaochen/SLAB](https://github.com/xinghaochen/SLAB/)
- **Paper:** [https://arxiv.org/abs/2405.11582](https://arxiv.org/abs/2405.11582)
## Evaluation

Evaluation follows the official recipe in [SLAB/llama](https://github.com/xinghaochen/SLAB/tree/main/llama):
```bash
python evaluation.py --ckpt <checkpoint-path>
```
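If the checkpoint is hosted on the Hub rather than stored locally, it can be fetched first and its local path passed to the script. This is a sketch assuming the repo id `OpenEfficientAI/SLAB-Llama-350M` and that `evaluation.py` accepts the downloaded directory via `--ckpt`:

```python
# Sketch: download the checkpoint from the Hub, then run the official evaluation script.
# Assumptions: the repo id below matches this model card, and evaluation.py (from the
# SLAB repository) accepts the downloaded directory as the --ckpt argument.
import subprocess
from huggingface_hub import snapshot_download

ckpt_dir = snapshot_download(repo_id="OpenEfficientAI/SLAB-Llama-350M")
subprocess.run(["python", "evaluation.py", "--ckpt", ckpt_dir], check=True)
```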
Reference results are shown in the [official results figure](https://github.com/xinghaochen/SLAB/blob/main/docs/llama.png).
## Citation

**BibTeX:**
```bibtex
@inproceedings{guo2024slab,
  title={SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization},
  author={Guo, Jialong and Chen, Xinghao and Tang, Yehui and Wang, Yunhe},
  booktitle={International Conference on Machine Learning},
  year={2024}
}
```