---
license: mit
---
An unofficial reproduction of the PRepBN-Llama-350M checkpoint for [SLAB](https://github.com/xinghaochen/SLAB/).
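
A minimal loading sketch, assuming the checkpoint is packaged in the standard `transformers` layout; the repo id below is a placeholder, and since PRepBN swaps LayerNorm for progressively re-parameterized BatchNorm, the checkpoint may rely on custom modeling code (hence `trust_remote_code=True`).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<this-model-repo>"  # placeholder: replace with this repository's id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# trust_remote_code=True in case the PRepBN modules ship as custom code.
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

# Smoke test: greedy-decode a few tokens.
inputs = tokenizer("The meaning of life is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```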

### Model Sources
- **Repository:** [https://github.com/xinghaochen/SLAB](https://github.com/xinghaochen/SLAB/)
- **Paper:** [SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization](https://arxiv.org/abs/2405.11582)

## Evaluation
Evaluation follows the upstream SLAB repository's Llama setup (https://github.com/xinghaochen/SLAB/tree/main/llama); run its `evaluation.py` against the checkpoint:

```bash
python evaluation.py --ckpt <checkpoint-path>
```
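
For a rough sanity check outside the upstream script, a generic sliding-window perplexity sketch on WikiText-2 is below. This is not the SLAB evaluation protocol; the dataset, context length, and stride are assumptions, so use `evaluation.py` above for numbers comparable to the paper.

```python
# Generic WikiText-2 perplexity (NOT the upstream SLAB protocol).
# Assumes the checkpoint loads through `transformers` as sketched earlier.
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "<this-model-repo>"  # placeholder: replace with this repository's id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True).eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids

max_length, stride = 1024, 512  # assumed context length and stride
nll_sum, n_tokens, prev_end = 0.0, 0, 0
for begin in range(0, ids.size(1), stride):
    end = min(begin + max_length, ids.size(1))
    input_ids = ids[:, begin:end]
    target_ids = input_ids.clone()
    target_ids[:, : -(end - prev_end)] = -100  # score only tokens new to this window
    with torch.no_grad():
        loss = model(input_ids, labels=target_ids).loss
    scored = (target_ids[:, 1:] != -100).sum().item()  # tokens the shifted loss covers
    nll_sum += loss.item() * scored
    n_tokens += scored
    prev_end = end
    if end == ids.size(1):
        break

print(f"WikiText-2 perplexity: {math.exp(nll_sum / n_tokens):.2f}")
```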

Reference results from the upstream SLAB repository: [Results](https://github.com/xinghaochen/SLAB/blob/main/docs/llama.png)

**BibTeX:**

```bibtex
@inproceedings{guo2024slab,
  title={SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization},
  author={Guo, Jialong and Chen, Xinghao and Tang, Yehui and Wang, Yunhe},
  booktitle={International Conference on Machine Learning},
  year={2024}
}
```