---
license: mit
---
Unofficially reproduced PRepBN-Llama-350M checkpoints for SLAB.
Model Sources
- Repository: https://github.com/xinghaochen/SLAB/
- Paper: https://arxiv.org/abs/2405.11582
Evaluation
The evaluation code is provided in the SLAB repository at https://github.com/xinghaochen/SLAB/tree/main/llama. Run it with:

python evaluation.py --ckpt <checkpoint-path>
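
As a convenience, here is a minimal sketch of fetching the checkpoint from the Hugging Face Hub and passing it to the evaluation script. The repo_id and filename below are hypothetical placeholders (the actual Hub repository and checkpoint file name are not stated here), and the script is assumed to be run from the llama directory of the SLAB repository.

```python
# Minimal sketch: download the checkpoint, then invoke SLAB's evaluation script.
# NOTE: repo_id and filename are hypothetical placeholders; substitute the
# actual Hub repository and checkpoint file name for this model.
import subprocess

from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="<hub-username>/PRepBN-Llama-350M",  # placeholder repo id
    filename="prepbn_llama_350m.pth",            # placeholder checkpoint file
)

# Assumes the current working directory is SLAB/llama, where evaluation.py lives.
subprocess.run(["python", "evaluation.py", "--ckpt", ckpt_path], check=True)
```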
BibTeX:
@inproceedings{guo2024slab,
title={SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization},
author={Guo, Jialong and Chen, Xinghao and Tang, Yehui and Wang, Yunhe},
booktitle={International Conference on Machine Learning},
year={2024}
}