---
library_name: peft
base_model: LSX-UniWue/LLaMmlein_1B
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: LLaMmlein_1b_chat_all
  results: []
datasets:
- LSX-UniWue/Guanako
- FreedomIntelligence/sharegpt-deutsch
- FreedomIntelligence/alpaca-gpt4-deutsch
language:
- de
license: other
---
|
|
|
# LLäMmlein 1B Chat |
|
|
|
This is a chat adapter for [LLäMmlein 1B](https://huggingface.co/LSX-UniWue/LLaMmlein_1B), the German Tinyllama 1B language model.

Find more details on our [project page](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/) and in our [preprint](https://arxiv.org/abs/2411.11171)!
|
|
|
## Run it |
|
```py
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.manual_seed(42)

# script config
base_model_name = "LSX-UniWue/LLaMmlein_1B"
chat_adapter_name = "LSX-UniWue/LLaMmlein_1B_chat_all"
device = "mps"  # or "cuda"

# chat history
messages = [
    {
        "role": "user",
        "content": "Na wie geht's?",
    },
]

# load the base model and attach the chat adapter
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    attn_implementation="flash_attention_2" if device == "cuda" else None,
    torch_dtype=torch.bfloat16,
    device_map=device,
)
base_model.resize_token_embeddings(32064)  # match the chat tokenizer's extended vocabulary
model = PeftModel.from_pretrained(base_model, chat_adapter_name)
tokenizer = AutoTokenizer.from_pretrained(chat_adapter_name)

# encode the chat history in ChatML format
chat = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
).to(device)

# generate and print the response
print(
    tokenizer.decode(
        model.generate(
            chat,
            max_new_tokens=300,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
        )[0],
        skip_special_tokens=False,
    )
)
```
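Note that `model.generate` decodes greedily with these arguments; passing `do_sample=True` together with `temperature` or `top_p` produces more varied replies.

## Merge the adapter (optional)

If you want to serve the model without a `peft` dependency at inference time, you can fold the adapter weights into the base model. The sketch below is a minimal example, assuming the LoRA-style adapter loaded in the snippet above and reusing its `model` and `tokenizer` objects; the output directory name is only a placeholder.

```py
# fold the adapter weights into the base model (returns a plain transformers model)
merged_model = model.merge_and_unload()

# save the merged checkpoint next to its tokenizer; the path is an example
merged_model.save_pretrained("LLaMmlein_1B_chat_merged")
tokenizer.save_pretrained("LLaMmlein_1B_chat_merged")
```

The merged checkpoint can then be loaded directly with `AutoModelForCausalLM.from_pretrained`, no `peft` required.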