Lite-Jamba-6x181M-intermediate-30M

This compact base model is inspired by Jamba and has approximately 1.09 billion parameters, of which 362 million are active.
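
The two parameter counts can be related to the model name with some quick arithmetic. The sketch below rests on assumptions the card does not state: that "6x181M" means six experts of roughly 181 million parameters each, and that two experts are active per token. It also ignores any weights shared across experts, so treat it as a naming sanity check rather than an exact breakdown.

```python
# Assumption (inferred from the model name, not stated in the card):
# "6x181M" = 6 experts of about 181 million parameters each.
NUM_EXPERTS = 6
PARAMS_PER_EXPERT_M = 181  # millions of parameters

# Assumption: 2 experts are active per token.
ACTIVE_EXPERTS = 2

total_m = NUM_EXPERTS * PARAMS_PER_EXPERT_M
active_m = ACTIVE_EXPERTS * PARAMS_PER_EXPERT_M

print(f"total  ~ {total_m / 1000:.2f}B parameters")
print(f"active ~ {active_m}M parameters")
```

Under these assumptions the products match the stated totals of about 1.09 billion and 362 million.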

Tokenizer

The model uses a tokenizer from the Mixtral repository with some modifications.

Training Details

This intermediate checkpoint has been trained on 30 million tokens.

Evaluation Metrics

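| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| winogrande | 1 | none | 0 | acc | 0.4886 | ± 0.0140 |
| piqa | 1 | none | 0 | acc | 0.5343 | ± 0.0116 |
| | | none | 0 | acc_norm | 0.5218 | ± 0.0117 |
| openbookqa | 1 | none | 0 | acc | 0.1180 | ± 0.0144 |
| | | none | 0 | acc_norm | 0.2560 | ± 0.0195 |
| hellaswag | 1 | none | 0 | acc | 0.2571 | ± 0.0044 |
| | | none | 0 | acc_norm | 0.2492 | ± 0.0043 |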
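
One way to read these numbers is to compare each zero-shot accuracy with a random-guessing baseline. The sketch below copies the `acc` values and standard errors from the table above. The baseline is an assumption: uniform guessing over the usual number of answer options for each task (2 for winogrande and piqa, 4 for openbookqa and hellaswag). The helper name `gap_from_chance` is hypothetical and used only for illustration.

```python
# Zero-shot `acc` values and standard errors copied from the evaluation table.
results = {
    "winogrande": (0.4886, 0.0140),
    "piqa": (0.5343, 0.0116),
    "openbookqa": (0.1180, 0.0144),
    "hellaswag": (0.2571, 0.0044),
}

# Assumption: chance accuracy is 1 / (number of answer options per question).
num_choices = {"winogrande": 2, "piqa": 2, "openbookqa": 4, "hellaswag": 4}


def gap_from_chance(task):
    """Return (accuracy - chance, the same gap in units of standard error)."""
    acc, stderr = results[task]
    chance = 1.0 / num_choices[task]
    gap = acc - chance
    return gap, gap / stderr


for task in results:
    gap, z = gap_from_chance(task)
    print(f"{task:<11} gap={gap:+.4f} ({z:+.1f} stderr)")
```

A gap within a couple of standard errors of zero is hard to distinguish from guessing, which would not be surprising for an intermediate checkpoint trained on 30 million tokens.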

Risk Disclaimer

By using this model, you acknowledge that you understand and assume the risks associated with its use. You are solely responsible for ensuring compliance with all applicable laws and regulations. We disclaim any liability for problems arising from the use of this open-source model, including but not limited to direct, indirect, incidental, consequential, or punitive damages. We make no warranties, express or implied, regarding the model's performance, accuracy, or fitness for a particular purpose. Your use of this model is at your own risk, and you agree to hold harmless and indemnify us, our affiliates, and our contributors from any claims, damages, or expenses arising from your use of the model.

Model size: 1.09B params (Safetensors, tensor type F32)