|
--- |
|
license: other |
|
library_name: transformers |
|
tags: |
|
- generated_from_trainer |
|
base_model: Qwen/Qwen2.5-3B |
|
license_name: qwen-research |
|
license_link: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE |
|
model-index: |
|
- name: outputs/gelato-3b |
|
results: [] |
|
--- |
|
|
|
<!-- This model card has been generated automatically according to the information the Trainer had access to. You |
|
should probably proofread and complete it, then remove this comment. --> |
|
|
|
Prompt Format: **ChatML** |
|
|
|
This is an experimental which was heavily optimized for reasoning tasks and not meant for production-use. |
|
|
|
GGUFs: https://huggingface.co/mradermacher/raspberry-3B-GGUF |
|
|
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) |
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_arcee-ai__raspberry-3B) |
|
|
|
| Metric |Value| |
|
|-------------------|----:| |
|
|Avg. |15.40| |
|
|IFEval (0-Shot) |31.54| |
|
|BBH (3-Shot) |19.53| |
|
|MATH Lvl 5 (4-Shot)| 7.63| |
|
|GPQA (0-shot) | 3.69| |
|
|MuSR (0-shot) | 9.41| |
|
|MMLU-PRO (5-shot) |20.60| |
|
|
|
|