Lite-Jamba-6x181M-intermediate-30M
This compact base model is inspired by Jamba and has approximately 1.09 billion parameters, of which 362 million are active.
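The two figures above line up with the "6x181M" in the model name. The sketch below reproduces that arithmetic under the assumption that the name means six experts of roughly 181 million parameters each, with two experts active at a time. The active-expert count is inferred from 362M / 181M and is not stated in this card, and shared (non-expert) parameters are ignored, so treat it as a rough check rather than an exact breakdown.

```python
# Nominal parameter arithmetic read off the model name "6x181M".
# ASSUMPTION: 6 experts of ~181M parameters each; shared (non-expert)
# parameters are ignored in this rough count.
PARAMS_PER_EXPERT = 181_000_000
NUM_EXPERTS = 6
# ASSUMPTION: inferred from 362M / 181M, not stated in the model card.
ACTIVE_EXPERTS = 2

total_params = NUM_EXPERTS * PARAMS_PER_EXPERT
active_params = ACTIVE_EXPERTS * PARAMS_PER_EXPERT

print(f"total:  {total_params / 1e9:.2f}B parameters")
print(f"active: {active_params / 1e6:.0f}M parameters")
```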
Tokenizer
The model uses a tokenizer from the Mixtral repository with some modifications.
Training Details
This intermediate checkpoint has been trained on 30 million tokens.
Evaluation Metrics
Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
---|---|---|---|---|---|---|---|
winogrande | 1 | none | 0 | acc | 0.4886 | ± | 0.0140 |
piqa | 1 | none | 0 | acc | 0.5343 | ± | 0.0116 |
piqa | 1 | none | 0 | acc_norm | 0.5218 | ± | 0.0117 |
openbookqa | 1 | none | 0 | acc | 0.1180 | ± | 0.0144 |
openbookqa | 1 | none | 0 | acc_norm | 0.2560 | ± | 0.0195 |
hellaswag | 1 | none | 0 | acc | 0.2571 | ± | 0.0044 |
hellaswag | 1 | none | 0 | acc_norm | 0.2492 | ± | 0.0043 |
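To put these numbers in context, it can help to compare each score with random-guess accuracy. The sketch below builds an approximate 95% interval (value ± 1.96 × stderr) for each row and checks whether the chance level falls inside it. The chance levels are assumptions based on the number of answer options per task (two for winogrande and piqa, four for openbookqa and hellaswag), and `compare_to_chance` is a hypothetical helper written for this illustration, not part of any evaluation tool.

```python
# Approximate 95% intervals for the table rows, compared with random guessing.
Z_95 = 1.96  # normal-approximation multiplier for a 95% interval

# ASSUMPTION: chance accuracy = 1 / (number of answer options per task).
CHANCE = {"winogrande": 0.5, "piqa": 0.5, "openbookqa": 0.25, "hellaswag": 0.25}

# (task, metric, value, stderr) copied from the evaluation table.
RESULTS = [
    ("winogrande", "acc", 0.4886, 0.0140),
    ("piqa", "acc", 0.5343, 0.0116),
    ("piqa", "acc_norm", 0.5218, 0.0117),
    ("openbookqa", "acc", 0.1180, 0.0144),
    ("openbookqa", "acc_norm", 0.2560, 0.0195),
    ("hellaswag", "acc", 0.2571, 0.0044),
    ("hellaswag", "acc_norm", 0.2492, 0.0043),
]


def compare_to_chance(task, value, stderr):
    """Return (low, high, verdict) for value +/- Z_95 * stderr."""
    low = value - Z_95 * stderr
    high = value + Z_95 * stderr
    chance = CHANCE[task]
    if low <= chance <= high:
        verdict = "consistent with chance"
    elif low > chance:
        verdict = "above chance"
    else:
        verdict = "below chance"
    return low, high, verdict


for task, metric, value, stderr in RESULTS:
    low, high, verdict = compare_to_chance(task, value, stderr)
    print(f"{task:<11} {metric:<9} [{low:.4f}, {high:.4f}]  {verdict}")
```

Under these assumptions most rows sit at or near chance, which is not surprising for a checkpoint trained on only 30 million tokens.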
Risk Disclaimer
By using this model, you acknowledge that you understand and assume the risks associated with its use. You are solely responsible for ensuring compliance with all applicable laws and regulations. We disclaim any liability for problems arising from the use of this open-source model, including but not limited to direct, indirect, incidental, consequential, or punitive damages. We make no warranties, express or implied, regarding the model's performance, accuracy, or fitness for a particular purpose. Your use of this model is at your own risk, and you agree to hold harmless and indemnify us, our affiliates, and our contributors from any claims, damages, or expenses arising from your use of the model.