Hermes + Leo = Hermeo

Hermeo-7B

A German-English language model merged from DPOpenHermes-7B-v2 and leo-mistral-hessianai-7b-chat using mergekit. Both base models are fine-tuned versions of Mistral-7B-v0.1.

Model details

How to use

You can use this model directly with a text-generation pipeline. Because generation involves random sampling, we set a seed for reproducibility:

>>> from transformers import pipeline, set_seed
>>> generator = pipeline('text-generation', model='malteos/hermeo-7b')
>>> set_seed(42)
>>> generator("Hallo, Ich bin ein Sprachmodell,", max_length=40, num_return_sequences=1)
[{'generated_text': 'Hallo, Ich bin ein Sprachmodell, das dir bei der Übersetzung von Texten zwischen Deutsch und Englisch helfen kann. Wenn du mir einen Text in Deutsch'}]

Acknowledgements

Evaluation

The evaluation follows the methodology of the Open LLM Leaderboard.

German benchmarks

| German tasks | MMLU-DE | Hellaswag-DE | ARC-DE | Average |
|---|---|---|---|---|
| Models / Few-shots | (5 shots) | (10 shots) | (24 shots) | |
| 7B parameters | | | | |
| llama-2-7b | 0.400 | 0.513 | 0.381 | 0.431 |
| leo-hessianai-7b | 0.400 | 0.609 | 0.429 | 0.479 |
| bloom-6b4-clp-german | 0.274 | 0.550 | 0.351 | 0.392 |
| mistral-7b | 0.524 | 0.588 | 0.473 | 0.528 |
| leo-mistral-hessianai-7b | 0.481 | 0.663 | 0.485 | 0.543 |
| leo-mistral-hessianai-7b-chat | 0.458 | 0.617 | 0.465 | 0.513 |
| DPOpenHermes-7B-v2 | 0.517 | 0.603 | 0.515 | 0.545 |
| hermeo-7b (this model) | 0.511 | 0.668 | 0.528 | 0.569 |
| 13B parameters | | | | |
| llama-2-13b | 0.469 | 0.581 | 0.468 | 0.506 |
| leo-hessianai-13b | 0.486 | 0.658 | 0.509 | 0.551 |
| 70B parameters | | | | |
| llama-2-70b | 0.597 | 0.674 | 0.561 | 0.611 |
| leo-hessianai-70b | 0.653 | 0.721 | 0.600 | 0.658 |

English benchmarks

| English tasks | MMLU | Hellaswag | ARC | Average |
|---|---|---|---|---|
| Models / Few-shots | (5 shots) | (10 shots) | (24 shots) | |
| llama-2-7b | 0.466 | 0.786 | 0.530 | 0.594 |
| leolm-hessianai-7b | 0.423 | 0.759 | 0.522 | 0.568 |
| bloom-6b4-clp-german | 0.264 | 0.525 | 0.328 | 0.372 |
| mistral-7b | 0.635 | 0.832 | 0.607 | 0.691 |
| leolm-mistral-hessianai-7b | 0.550 | 0.777 | 0.518 | 0.615 |
| hermeo-7b (this model) | 0.601 | 0.821 | 0.620 | 0.681 |
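
The Average column in both tables appears to be the unweighted mean of the three task scores, rounded to three decimals. This is an inference from the reported numbers and is not stated in the card. A quick check on the hermeo-7b rows, using a hypothetical helper `mean_score`:

```python
def mean_score(scores, ndigits=3):
    """Unweighted mean of per-task scores, rounded to `ndigits` decimals.

    Assumption: this matches how the Average column was computed.
    """
    return round(sum(scores) / len(scores), ndigits)


# hermeo-7b scores as reported in the tables (MMLU, Hellaswag, ARC).
german = [0.511, 0.668, 0.528]
english = [0.601, 0.821, 0.620]

print("German average:", mean_score(german))
print("English average:", mean_score(english))
```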

Prompting / Prompt Template

Prompt dialogue template (ChatML format):

"""
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

The model input can contain multiple conversation turns between user and assistant, e.g.

<|im_start|>user
{prompt 1}<|im_end|>
<|im_start|>assistant
{reply 1}<|im_end|>
<|im_start|>user
{prompt 2}<|im_end|>
<|im_start|>assistant
(...)
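
The templates above can be assembled programmatically. The following is a minimal sketch, not part of the original card: `build_chatml_prompt` is a hypothetical helper, and the newline placement (one newline after each `<|im_end|>` and after the final `assistant` tag) is an assumption read off the template as displayed.

```python
def build_chatml_prompt(turns, system_message=None):
    """Render conversation turns into a ChatML prompt string.

    `turns` is a list of (role, content) pairs, with role being
    "user" or "assistant". Newline placement is an assumption based
    on the template shown in this card.
    """
    parts = []
    if system_message is not None:
        parts.append(f"<|im_start|>system\n{system_message}<|im_end|>\n")
    for role, content in turns:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    # Leave the assistant turn open so the model writes the next reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt(
    [
        ("user", "Was ist die Hauptstadt von Deutschland?"),
        ("assistant", "Die Hauptstadt von Deutschland ist Berlin."),
        ("user", "Und von Frankreich?"),
    ],
    system_message="Du bist ein hilfreicher Assistent.",
)
print(prompt)
```

The resulting string can be passed as the input text to the `generator` pipeline from the "How to use" section. If the model's tokenizer ships a chat template, that template would be the more authoritative source for the exact formatting.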

License

Apache 2.0

See also
