Lite-Jamba-6x181M-intermediate-30M

This is a compact base model inspired by Jamba. It has approximately 1.09 billion parameters in total, of which 362 million are active.
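
The parameter counts can be checked against the model name with simple arithmetic. The sketch below assumes that "6x181M" means six experts of roughly 181 million parameters each, and that two experts are active per token. Neither assumption is stated in this card; the names `NUM_EXPERTS`, `PARAMS_PER_EXPERT`, and `ACTIVE_EXPERTS` are illustrative only.

```python
# Hypothetical reading of the name "6x181M": six experts of ~181M parameters
# each. This is an assumption, not something the card confirms.
NUM_EXPERTS = 6
PARAMS_PER_EXPERT = 181_000_000

# Assumption: two experts are active per token, chosen only because
# 2 * 181M matches the stated 362M active parameters.
ACTIVE_EXPERTS = 2

total_params = NUM_EXPERTS * PARAMS_PER_EXPERT
active_params = ACTIVE_EXPERTS * PARAMS_PER_EXPERT

print(f"total:  {total_params / 1e9:.2f}B")
print(f"active: {active_params / 1e6:.0f}M")
```

Under this reading the totals agree with the figures above. A real mixture-of-experts model usually shares some layers across experts, so the true breakdown may differ.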

Tokenizer

The model uses a tokenizer from the Mixtral repository with some modifications.

Training Details

This intermediate checkpoint has been trained on 30 million tokens.

Evaluation Metrics

Tasks	Version	Filter	Metric	Value		Stderr
winogrande	1	none	acc	0.4886	±	0.0140
piqa	1	none	acc	0.5343	±	0.0116
		none	acc_norm	0.5218	±	0.0117
openbookqa	1	none	acc	0.1180	±	0.0144
		none	acc_norm	0.2560	±	0.0195
hellaswag	1	none	acc	0.2571	±	0.0044
		none	acc_norm	0.2492	±	0.0043
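
For an early checkpoint it can help to compare each score with random guessing. The sketch below is one rough way to do that: it divides the gap between each score and a chance baseline by the reported standard error. The chance levels are assumptions based on the usual number of answer choices (2 for winogrande and piqa, 4 for openbookqa and hellaswag), and the helper names `z_score` and `summarize` are illustrative.

```python
# Scores and standard errors are copied from the table above.
# Chance levels are assumptions: 1 / (number of answer choices).
RESULTS = [
    # (task, metric, value, stderr, assumed chance level)
    ("winogrande", "acc", 0.4886, 0.0140, 0.50),
    ("piqa", "acc", 0.5343, 0.0116, 0.50),
    ("piqa", "acc_norm", 0.5218, 0.0117, 0.50),
    ("openbookqa", "acc", 0.1180, 0.0144, 0.25),
    ("openbookqa", "acc_norm", 0.2560, 0.0195, 0.25),
    ("hellaswag", "acc", 0.2571, 0.0044, 0.25),
    ("hellaswag", "acc_norm", 0.2492, 0.0043, 0.25),
]


def z_score(value, stderr, chance):
    """Distance from the chance baseline, in units of standard error."""
    return (value - chance) / stderr


def summarize(results):
    """Return (task, metric, z) rows with z rounded to two decimals."""
    return [
        (task, metric, round(z_score(value, stderr, chance), 2))
        for task, metric, value, stderr, chance in results
    ]


for task, metric, z in summarize(RESULTS):
    print(f"{task:<12}{metric:<10}z = {z:+.2f}")
```

With these assumed baselines, most scores fall within two standard errors of chance. The exceptions are piqa acc, which sits above chance, and openbookqa acc, which sits well below it.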

Risk Disclaimer

By using this model, you acknowledge that you understand and assume the risks associated with its use. You are solely responsible for ensuring compliance with all applicable laws and regulations. We disclaim any liability for problems arising from the use of this open-source model, including but not limited to direct, indirect, incidental, consequential, or punitive damages. We make no warranties, express or implied, regarding the model's performance, accuracy, or fitness for a particular purpose. Your use of this model is at your own risk, and you agree to hold harmless and indemnify us, our affiliates, and our contributors from any claims, damages, or expenses arising from your use of the model.