---
license: other
license_name: other
license_link: LICENSE
---

Model merged with the [Reborn Merge Method](https://medium.com/@puffanddmx82/reborn-elevating-model-adaptation-with-merging-for-superior-nlp-performance-f604e8e307b2)

Keep in mind that the accuracy of this merge's answers may vary depending on the questions you ask.

Could this merge serve as a base for my future merge work?

I hope this merged model combines knowledge and grammar well enough that it doesn't just give strange, nonsensical answers. Then I can cook up something new and cool with the next merge...

P.S. The above is not meant to suggest that any of the source models is strange. It means I may have done the merge incorrectly. I hope there is no misunderstanding.

I am open to collaboration and similar proposals, if you are interested.

```
Reborn Merge Information

[models info]
reference_model_name = "MLP-KTLim/llama-3-Korean-Bllossom-8B"
base_model_name = "NousResearch/Meta-Llama-3-8B-Instruct"
target_model_name = "maum-ai/Llama-3-MAAL-8B-Instruct-v0.1"

[interpolating mismatch part vocab]
Interpolating tensor 'model.embed_tokens.weight' to match the shape: torch.Size([145088, 4096]) vs torch.Size([128256, 4096])
Interpolating tensor 'lm_head.weight' to match the shape: torch.Size([145088, 4096]) vs torch.Size([128256, 4096])
Interpolating tensor 'model.embed_tokens.weight' to match the shape: torch.Size([128256, 4096]) vs torch.Size([128257, 4096])
Interpolating tensor 'lm_head.weight' to match the shape: torch.Size([128256, 4096]) vs torch.Size([128257, 4096])
```
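The log above reports that the embedding and output tensors were interpolated so that vocabularies of different sizes (for example 145088 rows vs. 128256 rows) end up with the same shape. The log does not say which interpolation scheme the Reborn method uses, so the following is only a minimal sketch under the assumption of plain linear interpolation along the vocabulary (row) axis. It uses small Python lists instead of real tensors, and the function name `interpolate_rows` is hypothetical.

```python
def interpolate_rows(matrix, new_rows):
    """Linearly resample a list of equal-length rows to `new_rows` rows.

    Assumption: linear interpolation along the row axis with the first
    and last rows kept fixed. The actual merge may use a different scheme.
    """
    old_rows = len(matrix)
    if old_rows == 0 or new_rows <= 0:
        raise ValueError("need at least one source row and one target row")
    if old_rows == 1 or new_rows == 1:
        # Nothing to interpolate between: repeat the first row.
        return [list(matrix[0]) for _ in range(new_rows)]

    # Map target row index i to a fractional position in the source rows.
    scale = (old_rows - 1) / (new_rows - 1)
    result = []
    for i in range(new_rows):
        pos = i * scale
        lo = min(int(pos), old_rows - 2)  # lower neighbour, clamped
        frac = pos - lo                   # weight of the upper neighbour
        row = [(1 - frac) * a + frac * b
               for a, b in zip(matrix[lo], matrix[lo + 1])]
        result.append(row)
    return result


# Toy stand-in for an embedding matrix: 3 "tokens", hidden size 2.
small = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0]]
grown = interpolate_rows(small, 5)   # stretch 3 rows to 5
shrunk = interpolate_rows(small, 2)  # squeeze 3 rows to 2
```

Note that resampling rows this way only makes the shapes compatible. It does not keep a given token ID aligned with the same row, so this sketch should not be read as a description of how token meanings are preserved in the merge.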

Ollama Create
```
jaylee@lees-MacBook-Pro-2  % ./ollama create Joah -f ./gguf/Joah-Llama-3-MAAL-MLP-KoEn-8B-Reborn/Modelfile_Q5_K_M 
transferring model data 
creating model layer 
creating template layer 
creating system layer 
creating parameters layer 
creating config layer 
using already created layer sha256:4eadb53f0c70683aeab133c60d76b8ffc9f41ca5d49524d4b803c19e5ce7e3a5 
using already created layer sha256:8ab4849b038cf0abc5b1c9b8ee1443dca6b93a045c2272180d985126eb40bf6f 
writing layer sha256:ae2974c64ea5d6f488eeb1b10717a270f48fb3452432589db6f5e60472ae96ac 
writing layer sha256:74ef6315972b317734fe01e7e1ad5b49fce1fa8ed3978cb66501ecb8c3a2e984 
writing layer sha256:83882a5e957b8ce0d454f26bcedb2819413b49d6b967b28d60edb8ac61edfa58 
writing manifest 
success 
```
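Once `ollama create` reports success, the model can be queried through Ollama's local HTTP API. The sketch below assumes the Ollama server is running at its default address (`http://localhost:11434`) and that the model was registered under the name `Joah` as in the command above. The helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="Joah"):
    """Build a non-streaming generate request for the given model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="Joah"):
    """Send the prompt to the local Ollama server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With the server running, `print(generate("안녕하세요"))` should print the model's reply. The system prompt, template, and parameters come from the Modelfile, so they do not need to be repeated in the request.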

MODELFILE
```
FROM joah-llama-3-maal-mlp-koen-8b-reborn-Q5_K_M.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""


# System prompt (English translation): "As a friendly chatbot, answer the other person's requests in as much detail and as kindly as possible. Give all answers in Korean."
SYSTEM """
친절한 챗봇으로서 상대방의 요청에 최대한 자세하고 친절하게 답하자. 모든 대답은 한국어(Korean)으로 대답해줘.
"""

PARAMETER num_keep 24
PARAMETER temperature 0.7
PARAMETER num_predict 3000
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
```
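To make the `TEMPLATE` above easier to follow, here is a minimal Python sketch of the string it produces before the model starts generating. It mirrors the Llama 3 header tokens from the Modelfile. The function name `render_llama3_prompt` is hypothetical, and Ollama performs this rendering itself, so this is for illustration only.

```python
def render_llama3_prompt(prompt, system=""):
    """Expand the Modelfile TEMPLATE up to the point where the reply begins."""
    parts = []
    if system:
        # Optional system turn, closed with <|eot_id|>.
        parts.append(
            "<|start_header_id|>system<|end_header_id|>\n\n"
            + system + "<|eot_id|>"
        )
    if prompt:
        # User turn, closed with <|eot_id|>.
        parts.append(
            "<|start_header_id|>user<|end_header_id|>\n\n"
            + prompt + "<|eot_id|>"
        )
    # Open the assistant turn; the model writes the response from here
    # and is stopped by the `stop` parameters such as <|eot_id|>.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


example = render_llama3_prompt("Hello", system="Be kind.")
```

The three `PARAMETER stop` lines matter because the reply ends when the model emits one of those tokens; without them generation could run on into a new turn.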

## Citation
**Language Model**
```text
@misc{bllossom,
  author = {ChangSu Choi and Yongbin Jeong and Seoyoon Park and InHo Won and HyeonSeok Lim and SangMin Kim and Yejee Kang and Chanhyuk Yoon and Jaewan Park and Yiseul Lee and HyeJin Lee and Younggyun Hahm and Hansaem Kim and KyungTae Lim},
  title = {Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean},
  year = {2024},
  journal = {LREC-COLING 2024},
  paperLink = {\url{https://arxiv.org/pdf/2403.10882}}
}

@article{llama3modelcard,
  title = {Llama 3 Model Card},
  author = {AI@Meta},
  year = {2024},
  url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```