There is currently an issue where the model generates random reserved special tokens (such as "<|reserved_special_token_49|>") at the end of its output. Please decode with skip_special_tokens=True. We will update this card once we have found the cause of this behaviour. If you find a solution, please let us know!

Llama 3 DiscoLM German 8b v0.1 Experimental

Introduction

Llama 3 DiscoLM German 8b v0.1 Experimental is a Llama 3-based version of DiscoLM German.

This is an experimental release and not intended for production use. The model is still in development and will be updated with new features and improvements in the future.

Please find an online demo here (we may take it offline for updates).

Prompt Format

DiscoLM German uses ChatML as the prompt format, which enables OpenAI endpoint compatibility and is supported by most inference libraries and frontends.

System prompts make the model steerable and open up new ways to interact with an LLM: they can set the rules, roles, and stylistic choices the model follows.

<|im_start|>system
Du bist ein hilfreicher Assistent.<|im_end|>
<|im_start|>user
Wer bist du?<|im_end|>
<|im_start|>assistant
Ich bin ein Sprachmodell namens DiscoLM German und ich wurde von DiscoResearch trainiert.<|im_end|>

This prompt is available as a chat template, which means you can format messages using the tokenizer.apply_chat_template() method:

messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Wer bist du?"}
]
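# return_dict=True yields input_ids and attention_mask, which can be unpacked into generate()
gen_input = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
)
model.generate(**gen_input)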

When tokenizing messages for generation, set add_generation_prompt=True when calling apply_chat_template(). This will append <|im_start|>assistant\n to your prompt, to ensure that the model continues with an assistant response.
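
To make the effect of add_generation_prompt concrete, here is a minimal sketch that builds the ChatML string by hand. The helper format_chatml is hypothetical and only mirrors the layout of the example prompt above; the chat template shipped with the tokenizer is the authoritative version and may differ in details such as a leading BOS token or whitespace.

```python
from typing import Dict, List


def format_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = False) -> str:
    """Hypothetical helper: render chat messages as ChatML text (illustration only)."""
    # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
    prompt = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its own reply
        prompt += "<|im_start|>assistant\n"
    return prompt


messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Wer bist du?"},
]
prompt = format_chatml(messages, add_generation_prompt=True)
print(prompt)
```

Without the flag, the prompt ends after the user turn's <|im_end|>, so the model has no open assistant turn to continue.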

Example Code for Inference

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Wer bist du?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
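
If the text you receive has already been decoded without skip_special_tokens=True (for example by a serving layer you do not control), the stray tokens mentioned in the note at the top of this card can be removed afterwards. The following is a minimal sketch, not an official fix: the helper name strip_reserved_tokens and the regular expression are assumptions based solely on the token name quoted in that note.

```python
import re

# Matches tokens such as <|reserved_special_token_49|> (pattern assumed from the example token)
RESERVED_TOKEN_PATTERN = re.compile(r"<\|reserved_special_token_\d+\|>")


def strip_reserved_tokens(text: str) -> str:
    """Hypothetical helper: drop reserved special tokens and trailing whitespace."""
    return RESERVED_TOKEN_PATTERN.sub("", text).rstrip()


raw = "Ich bin DiscoLM German.<|reserved_special_token_49|>"
cleaned = strip_reserved_tokens(raw)
print(cleaned)
```

Decoding with skip_special_tokens=True, as in the inference example, remains the simpler option whenever you control the decode step.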

Limitations & Biases

This model can produce factually incorrect and offensive output and should not be relied on to produce factually accurate information. It was trained on various public datasets. While great effort has been taken to clean the pretraining data, the model may still generate biased or otherwise offensive outputs, and it is the responsibility of the user to implement a safety/moderation layer. Please use with caution.

License

This model is distributed under the META LLAMA 3 COMMUNITY LICENSE; see LICENSE for more information.

Acknowledgements

Built with Meta Llama 3.

DiscoLM German is a DiscoResearch project, a collective effort by JP Harries, Björn Plüster and Daniel Auras.

Development of Llama 3 DiscoLM German 8b was sponsored by ellamind. Compute was sponsored generously by sysGen GmbH.

About DiscoResearch

DiscoResearch is an aspiring open research community for AI enthusiasts and LLM hackers. Come join our Discord, share your opinions and ideas, and advance open LLM research with us!

Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. This model should only be deployed with additional safety measures in place.