|
|
|
|
|
--- |
|
language: |
|
- en |
|
license: apache-2.0 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- trl |
|
- sft |
|
--- |
|
|
|
<style> |
|
@import url('https://fonts.googleapis.com/css2?family=Vollkorn:ital,wght@0,400..900;1,400..900&display=swap'); |
|
</style> |
|
|
|
<div style="background-color: #101010; border-radius: .5rem; padding: 2rem; font-family: monospace; font-size: .85rem; text-align: justify;"> |
|
|
|
 |
|
|
|
|
|
#### palmer turbo |
|
|
|
This model has a slightly different architecture and training recipe than previous palmer releases:
|
|
|
1. The model went through a continual pretraining stage in which only the lm_head and embedding layers were tuned (see the sketch after this list).

2. The base model was pretrained on 75k instruction/response pairs and then merged.

3. The architecture is similar to the palmer series but with a smaller context size (8,192 tokens).
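
The continual-pretraining setup from point 1 can be approximated with transformers by freezing everything except the embedding and lm_head parameters. This is a minimal sketch under assumptions, not the exact training code; the base checkpoint name and dtype are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint: substitute the actual palmer base model.
base = "h2oai/h2o-danube3-500m-base"

model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base)

# Freeze every parameter, then unfreeze only the input embeddings and lm_head,
# mirroring the "lm_head + embedding layers were tuned" setup described above.
for param in model.parameters():
    param.requires_grad = False
for module in (model.get_input_embeddings(), model.get_output_embeddings()):
    for param in module.parameters():
        param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable:,}")
# From here, any causal-LM training loop (e.g. trl's SFTTrainer) can be used.
```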
|
|
|
In short, palmer is now half the size and twice the speed with almost the same overall performance, trading some winogrande accuracy for notable improvements on mmlu and arc challenge. As of Wed 17 Jul, it beats all models <= 0.5b parameters on hellaswag.
|
|
|
Like all palmer models, it is biased to respond to questions without requiring any specific prompt template; feel free to further fine-tune it for your specific use case.
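
As a quick illustration, the model can be queried with a plain question through transformers. This is a hedged sketch: the repository id below is a placeholder and should be replaced with this model's actual hub path.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id: replace with this model's actual hub path.
repo = "appvoid/palmer-004-turbo"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

# No prompt template: a plain question works because the model is tuned to
# answer directly.
inputs = tokenizer("What is the capital of France?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```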
|
|
|
#### benchmarks |
|
|
|
These are zero-shot evaluations performed on current state-of-the-art language models. |
|
|
|
| Model                            | MMLU       | ARC-C  | HellaSwag  | PIQA       | Winogrande | Average    |
|----------------------------------|------------|--------|------------|------------|------------|------------|
| smollm-360m                      | 0.2537     | 0.3626 | 0.5350     | 0.7116     | 0.5659     | 0.4858     |
| tinyllama                        | 0.2577     | 0.3029 | 0.5935     | 0.7329     | 0.5959     | 0.4966     |
| qwen2-0.5b                       | **0.4413** | 0.2892 | 0.4905     | 0.6931     | 0.5699     | 0.4968     |
| danube3-500m-chat (current sota) | 0.2554     | 0.3626 | 0.6072     | 0.7432     | 0.6140     | 0.5164     |
| palmer-004-turbo                 | 0.2736     | 0.3558 | **0.6179** | 0.7367     | 0.6117     | **0.5191** |
| palmer-004                       | 0.2661     | 0.3490 | 0.6173     | **0.7481** | **0.6417** | **0.5244** |
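
These numbers can be reproduced with EleutherAI's lm-evaluation-harness (pip install lm-eval). The sketch below is a minimal, hedged example: the repository id is a placeholder, and exact scores and result keys vary with the harness version and task configs.

```python
from lm_eval import simple_evaluate

# Zero-shot run over the same tasks reported in the table above.
results = simple_evaluate(
    model="hf",
    model_args="pretrained=appvoid/palmer-004-turbo,dtype=bfloat16",  # placeholder repo id
    tasks=["mmlu", "arc_challenge", "hellaswag", "piqa", "winogrande"],
    num_fewshot=0,
    batch_size=8,
)

# Per-task metrics live under results["results"]; key names depend on the harness version.
for task, metrics in results["results"].items():
    print(task, metrics)
```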
|
|
|
|
|
#### thanks to |
|
|
|
- h2oai: performant base model provider |
|
- teknium: openhermes dataset provider |
|
- unsloth: training software and tooling
|
</div> |
|
|