|
--- |
|
tags: |
|
- merge |
|
- mergekit |
|
- lazymergekit |
|
- Azure99/blossom-v5-llama3-8b |
|
base_model: |
|
- Azure99/blossom-v5-llama3-8b |
|
license: llama3 |
|
--- |
|
|
|
# llama3-8b-spaetzle-v13 |
|
|
|
llama3-8b-spaetzle-v13 is a merge of the following models: |
|
* [Azure99/blossom-v5-llama3-8b](https://huggingface.co/Azure99/blossom-v5-llama3-8b) |
|
* [VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct](https://huggingface.co/VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct) |
|
|
|
The standard Llama 3 prompt format template is used unchanged.
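
For reference, a single user turn rendered with that template looks roughly like this (sketch only; a system turn would be prepended in the same way):

```
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

{your prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```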
|
|
|
## Benchmarks |
|
It should perform reasonably well in both German and English, e.g. it achieves 64.14 (.10) on EQ-Bench v2_de with 170 (171) of 171 questions parseable, per the [q4km GGUF](https://huggingface.co/cstr/llama3-8b-spaetzle-v13-GGUF) (BPE fixed).
|
For English, the EQ-Bench score (v2) is 75.59 with 171 parseable.
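
A minimal sketch for running the GGUF quant locally, assuming the llama-cpp-python package and a downloaded model file (the filename below is a placeholder for whichever file you fetch from the GGUF repo):

```python
# Sketch only: llama-cpp-python with a local copy of the q4km GGUF.
from llama_cpp import Llama

llm = Llama(
    model_path="llama3-8b-spaetzle-v13.Q4_K_M.gguf",  # hypothetical local filename
    n_ctx=8192,
    chat_format="llama-3",  # the model keeps the stock Llama 3 chat format
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Was ist ein großes Sprachmodell?"}],
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```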
|
|
|
| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | |
|
|------------------------------|------------|--------|-----------|--------|------------|------------|--------| |
|
| cstr/llama3-8b-spaetzle-v13 | 71.26 | 68.69 | 85.05 | 68.06 | 59.43 | 79.24 | 67.1 | |
|
|
|
## Sample output |
|
|
|
``` |
|
>>> What is 7777 + 3333? |
|
To find the sum, I'll add the two numbers together: |
|
|
|
7777 + 3333 = 11110 |
|
|
|
So the answer is 11110. |
|
|
|
>>> I have 10 apples. I find 3 gold coins in the bottom of a river. The river runs near a big city that has something to do with what I can spend the coins on |
|
... . I then lose 4 apples but gain a gold coin. Three birds run into my path and drop 6 apples each. I play an online game and win 6 gold coins but I have to |
|
... share them equally with my 2 teammates. I buy apples for all the coins I have. The price of an apple is 0.5 coins. How many apples do I have? And where i |
|
... s the river? |
|
Let's break down what you've gained and lost: |
|
|
|
* You found 3 gold coins in the river. |
|
* You lost 4 apples, so let's subtract those from your initial 10: 10 - 4 = 6 apples left. |
|
* Then you gained a gold coin, but that doesn't affect the number of apples. |
|
* The birds drop 6 apples each, and there are three birds, so that's an extra 18 apples: 6 (per bird) * 3 (birds) = 18. |
|
* You now have a total of 6 (initial apples) + 18 (from birds) = 24 apples. |
|
* In the online game, you won 6 gold coins, but you had to share them equally with your teammates. That means each of you got 2 coins: 6 / 3 = 2 coins |
|
per person. |
|
* You now have a total of 3 (initial coins) + 1 (new coin) + 2 (game winnings, shared) = 6 gold coins. |
|
* With the 6 gold coins, you can buy apples at a rate of 0.5 coins per apple. To find out how many apples you can buy, divide your total coins by the |
|
cost per apple: 6 / 0.5 = 12 apples. |
|
* Since you already have 24 apples, let's add the new ones to that: 24 + 12 = 36 apples. |
|
|
|
So, you now have a total of 36 apples. |
|
|
|
As for the river, it's not specified which city it runs near, so we can't pinpoint its exact location. But at least you found those coins! |
|
|
|
>>> Two years ago, I was three times as old as my brother was. In three years, I will be twice as old as my brother. How old is each of us now? |
|
Let's define the current age of you and your brother with variables. |
|
|
|
You are Y years old. |
|
Your brother is X years old. |
|
|
|
Two years ago: |
|
You were Y - 2 years old (because we're looking back in time). |
|
Your brother was X - 2 years old. |
|
|
|
Since two years ago, you were three times as old as your brother was, the equation would be: |
|
|
|
Y - 2 = 3(X - 2) |
|
|
|
Simplify and solve for X: |
|
|
|
Y - 2 = 3X - 6 |
|
Y = 3X - 4 |
|
|
|
Now we know that Y (your current age) is three times your brother's age minus four. Let's use this to find your current age. |
|
|
|
In three years: |
|
You will be Y + 3 years old. |
|
Your brother will be X + 3 years old. |
|
|
|
Since in three years, you'll be twice as old as your brother: |
|
|
|
Y + 3 = 2(X + 3) |
|
|
|
Simplify and solve for X (your brother's current age): |
|
|
|
Y + 3 = 2X + 6 |
|
Y - 2X = 3 |
|
(3X - 4) - 2X = 3 |
|
X - 4 = 3 |
|
X = 7 |
|
|
|
Now that we have your brother's current age, we can find yours: |
|
|
|
Y = 3X - 4 |
|
= 3(7) - 4 |
|
= 21 - 4 |
|
= 17 |
|
|
|
So you are currently 17 years old, and your brother is 7 years old. |
|
``` |
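
As a quick sanity check on the last answer (the age puzzle), the stated ages satisfy both conditions:

```python
# Verify the model's answer to the age puzzle in the sample output above.
you, brother = 17, 7
assert you - 2 == 3 * (brother - 2)   # two years ago: 15 == 3 * 5
assert you + 3 == 2 * (brother + 3)   # in three years: 20 == 2 * 10
```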
|
|
|
## 🧩 Configuration |
|
|
|
```yaml |
|
models: |
|
- model: VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct |
|
# no parameters necessary for base model |
|
- model: Azure99/blossom-v5-llama3-8b |
|
parameters: |
|
density: 0.65 |
|
weight: 0.4 |
|
merge_method: dare_ties |
|
base_model: VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct |
|
parameters: |
|
int8_mask: true |
|
dtype: bfloat16 |
|
random_seed: 0 |
|
tokenizer_source: base |
|
``` |
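
To reproduce the merge, this configuration can be passed to mergekit's CLI roughly as follows (notebook syntax as in the Usage section below; the output path is illustrative):

```python
!pip install -qU mergekit

# Save the YAML above as config.yaml, then run the merge.
# --copy-tokenizer copies the base model's tokenizer into the output directory.
!mergekit-yaml config.yaml ./llama3-8b-spaetzle-v13 --copy-tokenizer
```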
|
|
|
## 💻 Usage |
|
|
|
```python |
|
!pip install -qU transformers accelerate |
|
|
|
from transformers import AutoTokenizer |
|
import transformers |
|
import torch |
|
|
|
model = "cstr/llama3-8b-spaetzle-v13" |
|
messages = [{"role": "user", "content": "What is a large language model?"}] |
|
|
|
tokenizer = AutoTokenizer.from_pretrained(model) |
|
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) |
|
pipeline = transformers.pipeline( |
|
"text-generation", |
|
model=model, |
|
torch_dtype=torch.float16, |
|
device_map="auto", |
|
) |
|
|
|
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95) |
|
print(outputs[0]["generated_text"]) |
|
``` |