
Creative writing has never been so accessible; palmer goes beyond what was thought possible for small language models. This model is a "MErging of Experts" (MEoE) that uses palmer-002-2401 as its base and is biased to behave as an assistant without requiring any prompt. As a result of these efforts, palmer is better than most 1B language models on most benchmarks, despite sometimes being 40% smaller than its counterparts.

| Model          | MMLU   | ARC-C  | OBQA   | HellaSwag | PIQA   | Winogrande | Average |
|----------------|--------|--------|--------|-----------|--------|------------|---------|
| tinyllama-chat | 0.2470 | 0.3285 | 0.3740 | 0.6037    | 0.7448 | 0.6022     | 0.4833  |
| zyte-1b        | 0.2397 | 0.3353 | 0.3700 | 0.6086    | 0.7541 | 0.5998     | 0.4845  |
| palmer-002.5   | 0.2534 | 0.3370 | 0.3740 | 0.6128    | 0.7486 | 0.6535     | 0.4965  |
| qwen-1-8       | 0.4536 | 0.3490 | 0.3320 | 0.5876    | 0.7307 | 0.5896     | 0.5070  |
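
These columns match standard tasks from EleutherAI's lm-evaluation-harness, so one plausible way to reproduce scores like these is sketched below. The card does not confirm the harness, the exact task names, or the Hub repo id (`appvoid/palmer-002.5` is an assumption), so treat this as a rough sketch rather than the official evaluation setup:

```python
# Hypothetical reproduction sketch using lm-evaluation-harness (pip install lm-eval).
# The repo id and the column-to-task mapping are assumptions, not confirmed by the card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=appvoid/palmer-002.5,dtype=float16",  # assumed repo id
    tasks=["mmlu", "arc_challenge", "openbookqa", "hellaswag", "piqa", "winogrande"],
)

# Print the per-task metrics returned by the harness
for task, metrics in results["results"].items():
    print(task, metrics)
```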

This work constitutes, given its compactness, an advancement toward capable small language models, readily empowering edge devices such as mobile phones, Raspberry Pis, and automated software/robots. Additionally, palmer-002.5 departs from the palmer family's original philosophy: it becomes a more powerful model by training on more data instead of less.

```
prompt: Reality is but
output: a dream,
And the dreams we make are our reality.

The world is a canvas, painted by our minds,
And we can make it a masterpiece.

So let us create, let us dream,
And let our imagination run wild.

For in our imagination lies our power,
To create a world that is truly our own.
```
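
A sample like the one above can be generated with the standard transformers text-generation API. This is a minimal sketch; the repo id `appvoid/palmer-002.5` and the sampling settings are assumptions, not values stated in the card:

```python
# Minimal generation sketch (assumes the model is published on the Hugging Face Hub
# as "appvoid/palmer-002.5"; adjust the repo id as needed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "appvoid/palmer-002.5"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)

# The model is biased as an assistant, so a bare completion prompt works
inputs = tokenizer("Reality is but", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```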

You can support me through Ko-fi.

Note that since this model uses a transformer architecture, like any popular language model, its output may contain hallucinations (mistakes or false statements); it must therefore be used with caution in sensitive scenarios.

Model size: 1.1B params · Tensor type: FP16 · Format: Safetensors