|
---
license: other
license_name: other
license_link: LICENSE
---
|
|
|
Model merged with the [Reborn Merge Method](https://medium.com/@puffanddmx82/reborn-elevating-model-adaptation-with-merging-for-superior-nlp-performance-f604e8e307b2)
|
|
|
Keep in mind that answer accuracy for your particular questions may vary with this merge.
|
|
|
I wonder whether this merge can serve as a base for my future merge work.
|
|
|
I hope this merged model combines knowledge and grammar well enough that it doesn't just give strange, nonsensical answers. Then I can cook up something new and cool with the next merge...
|
|
|
PS: The above is not meant to suggest that any of the source models is strange; it only means I may be doing the merge wrong. I hope there is no misunderstanding.
|
|
|
I am open to collaboration and anything else if you are interested.
|
|
|
``` |
|
Reborn Merge Information |
|
|
|
[models info] |
|
reference_model_name = "MLP-KTLim/llama-3-Korean-Bllossom-8B" |
|
base_model_name = "NousResearch/Meta-Llama-3-8B-Instruct" |
|
target_model_name = "maum-ai/Llama-3-MAAL-8B-Instruct-v0.1" |
|
|
|
[interpolating mismatch part vocab] |
|
Interpolating tensor 'model.embed_tokens.weight' to match the shape: torch.Size([145088, 4096]) vs torch.Size([128256, 4096]) |
|
Interpolating tensor 'lm_head.weight' to match the shape: torch.Size([145088, 4096]) vs torch.Size([128256, 4096]) |
|
Interpolating tensor 'model.embed_tokens.weight' to match the shape: torch.Size([128256, 4096]) vs torch.Size([128257, 4096]) |
|
Interpolating tensor 'lm_head.weight' to match the shape: torch.Size([128256, 4096]) vs torch.Size([128257, 4096]) |
|
``` |
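
The log above shows that mismatched vocabulary dimensions (for example 145088 vs. 128256 rows) are interpolated to a common shape before merging. Below is a minimal sketch of how such a resize could be done in PyTorch; the function name `resize_vocab_tensor` and the choice of linear interpolation along the vocab axis are my own assumptions for illustration, not the exact Reborn implementation.

```python
# Hypothetical sketch: bring a [vocab, hidden] tensor to a target vocab size so
# that tensors from models with different vocabularies can be merged.
# Linear interpolation along the vocab axis is an assumption, not necessarily
# what Reborn does internally.
import torch
import torch.nn.functional as F

def resize_vocab_tensor(tensor: torch.Tensor, target_rows: int) -> torch.Tensor:
    """Interpolate a [vocab, hidden] tensor to [target_rows, hidden]."""
    if tensor.shape[0] == target_rows:
        return tensor
    # F.interpolate expects [batch, channels, length], so treat hidden dims as channels.
    resized = F.interpolate(
        tensor.t().unsqueeze(0),        # [1, hidden, vocab]
        size=target_rows,
        mode="linear",
        align_corners=False,
    )
    return resized.squeeze(0).t().contiguous()  # [target_rows, hidden]

# Small example; the real shapes in the log are 145088x4096 -> 128256x4096.
emb = torch.randn(1024, 64)
print(resize_vocab_tensor(emb, 896).shape)  # torch.Size([896, 64])
```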
|
|
|
## Ollama Create
|
``` |
|
jaylee@lees-MacBook-Pro-2 % ./ollama create Joah -f ./gguf/Joah-Llama-3-MAAL-MLP-KoEn-8B-Reborn/Modelfile_Q5_K_M |
|
transferring model data |
|
creating model layer |
|
creating template layer |
|
creating system layer |
|
creating parameters layer |
|
creating config layer |
|
using already created layer sha256:4eadb53f0c70683aeab133c60d76b8ffc9f41ca5d49524d4b803c19e5ce7e3a5 |
|
using already created layer sha256:8ab4849b038cf0abc5b1c9b8ee1443dca6b93a045c2272180d985126eb40bf6f |
|
writing layer sha256:ae2974c64ea5d6f488eeb1b10717a270f48fb3452432589db6f5e60472ae96ac |
|
writing layer sha256:74ef6315972b317734fe01e7e1ad5b49fce1fa8ed3978cb66501ecb8c3a2e984 |
|
writing layer sha256:83882a5e957b8ce0d454f26bcedb2819413b49d6b967b28d60edb8ac61edfa58 |
|
writing manifest |
|
success |
|
``` |
|
|
|
## Modelfile
|
``` |
|
FROM joah-llama-3-maal-mlp-koen-8b-reborn-Q5_K_M.gguf |
|
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|> |
|
|
|
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|> |
|
|
|
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|> |
|
|
|
{{ .Response }}<|eot_id|>""" |
|
|
|
|
|
# English gloss of the system prompt below: "As a friendly chatbot, answer the
# user's requests as kindly and in as much detail as possible. Give every answer
# in Korean."
SYSTEM """
친절한 챗봇으로서 상대방의 요청에 최대한 자세하고 친절하게 답하자. 모든 대답은 한국어(Korean)으로 대답해줘.
"""
|
|
|
PARAMETER num_keep 24 |
|
PARAMETER temperature 0.7 |
|
PARAMETER num_predict 3000 |
|
PARAMETER stop "<|start_header_id|>" |
|
PARAMETER stop "<|end_header_id|>" |
|
PARAMETER stop "<|eot_id|>" |
|
``` |
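
Once the model has been created with the Modelfile above, it can be queried like any other local Ollama model. The snippet below is a minimal sketch using the official `ollama` Python client (`pip install ollama`); the model name `Joah` matches the `ollama create` command above, and the prompt is just a placeholder.

```python
# Minimal chat example against the locally created "Joah" model.
# Assumes the Ollama server is running and the model was created as shown above.
import ollama

response = ollama.chat(
    model="Joah",
    messages=[
        {"role": "user", "content": "Please introduce yourself briefly."},
    ],
)
print(response["message"]["content"])  # the SYSTEM prompt steers replies into Korean
```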
|
|
|
## Citation |
|
**Language Model** |
|
```text |
|
@misc{bllossom,
  author = {ChangSu Choi and Yongbin Jeong and Seoyoon Park and InHo Won and HyeonSeok Lim and SangMin Kim and Yejee Kang and Chanhyuk Yoon and Jaewan Park and Yiseul Lee and HyeJin Lee and Younggyun Hahm and Hansaem Kim and KyungTae Lim},
  title = {Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean},
  year = {2024},
  journal = {LREC-COLING 2024},
  paperLink = {\url{https://arxiv.org/pdf/2403.10882}},
}

@article{llama3modelcard,
  title = {Llama 3 Model Card},
  author = {AI@Meta},
  year = {2024},
  url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
|
``` |