---
base_model:
- maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
- beomi/Llama-3-KoEn-8B-Instruct-preview
- asiansoul/Llama-3-Open-Ko-Linear-8B
- NousResearch/Meta-Llama-3-8B
- NousResearch/Meta-Llama-3-8B-Instruct
- ajibawa-2023/Code-Llama-3-8B
- defog/llama-3-sqlcoder-8b
- NousResearch/Hermes-2-Pro-Llama-3-8B
- Locutusque/llama-3-neural-chat-v2.2-8B
- asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1
library_name: transformers
tags:
- mergekit
- merge

---
# Joah-Llama-3-KoEn-8B-Coder-v2

<a href="https://ibb.co/k8hmBF4"><img src="https://i.ibb.co/J7z3tPv/Screenshot-2024-05-11-at-7-48-08-PM.png" alt="Screenshot-2024-05-11-at-7-48-08-PM" border="0"></a>

"A cool merge model with swag"

"Joah" by AsianSoul

Coming soon: a multilingual model merge based on this one, starting with German (Korean / English / German) 🌍

Where to use Joah: Medical, Korean, English, Translation, Code, Science... 🎥

<u>Stronger SQL code and other science capabilities compared to V1</u>

## 🎑 Merge Details


The performance of this merge model doesn't seem bad, though that is just my opinion ^^ 🏟️

This may not be a model that satisfies you. But if we continue to overcome our shortcomings, 

Won't we someday find the answer we want?

Don't worry even if you don't get the results you want. 

I'll find the answer for you.

Coming soon: real PoSE to extend Llama's context length to 64k, using my merge method, [reborn](https://medium.com/@puffanddmx82/reborn-elevating-model-adaptation-with-merging-for-superior-nlp-performance-f604e8e307b2).

I have found that most merged models published so far do not actually have 64k in their configs. I will improve on this in the next merge with reborn. If that doesn't work, I guess I'll have to find another way, right?

256k is not possible. My computer runs out of memory.

If you support me, I will try it on a computer with maximum specifications. I would also like to run large-scale tests for you by building a high-capacity network with 10G speeds.

### 🧢 Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base.
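
For intuition, here is a minimal toy sketch of the two ideas behind DARE TIES, written in plain Python on small lists instead of real tensors. It is not mergekit's implementation: the function names `dare_prune` and `dare_ties_merge` are hypothetical, and details such as weight normalization are left out. DARE randomly drops entries of each model's difference from the base and rescales the survivors; the TIES step then keeps, per parameter, only the contributions whose sign agrees with the dominant direction.

```python
import random


def dare_prune(delta, density, rng):
    """DARE: keep each entry with probability `density`, rescale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, weights, density, seed=0):
    """Toy DARE TIES merge over flat parameter lists (illustrative only)."""
    rng = random.Random(seed)
    # Task vectors: what each fine-tuned model changed relative to the base.
    deltas = [[f - b for f, b in zip(model, base)] for model in finetuned]
    # Sparsify each task vector with DARE, then scale it by the model's weight.
    weighted = [
        [w * d for d in dare_prune(delta, density, rng)]
        for delta, w in zip(deltas, weights)
    ]
    merged = []
    for i, b in enumerate(base):
        column = [wd[i] for wd in weighted]
        # TIES sign election: the sign carrying the larger total magnitude wins.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Only contributions that agree with the elected sign are added back.
        merged.append(b + sum(v for v in column if v * sign > 0))
    return merged


if __name__ == "__main__":
    base = [0.0, 1.0]
    finetuned = [[1.0, 1.5], [-0.5, 2.0]]
    print(dare_ties_merge(base, finetuned, weights=[0.5, 0.5], density=1.0))
```

With `density=1.0` nothing is dropped, so the toy example is deterministic; lower densities such as the 0.55 and 0.60 values used in this merge make the result depend on the random mask.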

### 📚 Models Merged

The following models were included in the merge:
* [maum-ai/Llama-3-MAAL-8B-Instruct-v0.1](https://huggingface.co/maum-ai/Llama-3-MAAL-8B-Instruct-v0.1)
* [beomi/Llama-3-KoEn-8B-Instruct-preview](https://huggingface.co/beomi/Llama-3-KoEn-8B-Instruct-preview)
* [asiansoul/Llama-3-Open-Ko-Linear-8B](https://huggingface.co/asiansoul/Llama-3-Open-Ko-Linear-8B)
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [ajibawa-2023/Code-Llama-3-8B](https://huggingface.co/ajibawa-2023/Code-Llama-3-8B)
* [defog/llama-3-sqlcoder-8b](https://huggingface.co/defog/llama-3-sqlcoder-8b)
* [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B)
* [Locutusque/llama-3-neural-chat-v2.2-8B](https://huggingface.co/Locutusque/llama-3-neural-chat-v2.2-8B)
* [asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1](https://huggingface.co/asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1)

### 🛹 Ollama

Modelfile_Q5_K_M 

```
FROM joah-llama-3-koen-8b-coder-v2-Q5_K_M.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""


# System prompt in English: "As a friendly chatbot, answer the other person's requests as thoroughly and kindly as possible. Give all answers in Korean."
SYSTEM """
친절한 챗봇으로서 상대방의 요청에 최대한 자세하고 친절하게 답하자. 모든 대답은 한국어(Korean)으로 대답해줘.
"""

PARAMETER num_keep 24
PARAMETER temperature 0.7
PARAMETER num_predict 3000
PARAMETER num_ctx 64000
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
```

```
ollama create joah -f ./Modelfile_Q5_K_M 
```

Modelfile_Q5_K_M is the default. I hope you will test the other files uploaded to my repo by changing the Modelfile to point at them and creating the Ollama model again.
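
To see what the `TEMPLATE` in the Modelfile above turns a request into, here is a small Python sketch that mirrors its structure by hand. The helper name `render_llama3_prompt` is hypothetical and this is not how Ollama renders templates internally; it only reproduces the text up to the point where the model's response would begin.

```python
def render_llama3_prompt(prompt, system=""):
    """Build the Llama 3 style prompt text that the Modelfile TEMPLATE describes."""
    parts = []
    if system:
        # Mirrors: {{ if .System }} ... {{ end }}
        parts.append(f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>")
    if prompt:
        # Mirrors: {{ if .Prompt }} ... {{ end }}
        parts.append(f"<|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|>")
    # The assistant header is always emitted; the model's reply follows it.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_llama3_prompt("Write a SQL query that counts users.", system="Answer in Korean."))
```

The three `PARAMETER stop` lines in the Modelfile correspond to the special tokens used here, so generation ends when the model emits any of them.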

### 🍎 Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model providing a general foundation without specific parameters

  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60  
      weight: 0.25  
  
  - model: beomi/Llama-3-KoEn-8B-Instruct-preview
    parameters:
      density: 0.55  
      weight: 0.15  
  
  - model: asiansoul/Llama-3-Open-Ko-Linear-8B
    parameters:
      density: 0.55  
      weight: 0.1  

  - model: maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
    parameters:
      density: 0.55  
      weight: 0.1 

  - model: asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1
    parameters:
      density: 0.55  
      weight: 0.2
      
  - model: ajibawa-2023/Code-Llama-3-8B
    parameters:
      density: 0.55  
      weight: 0.05  

  - model: defog/llama-3-sqlcoder-8b
    parameters:
      density: 0.55  
      weight: 0.1  

  - model: Locutusque/llama-3-neural-chat-v2.2-8B
    parameters:
      density: 0.55  
      weight: 0.1 

  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.55  
      weight: 0.05 

merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: bfloat16


```
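
As a quick sanity check on the configuration above, the short Python sketch below copies the `weight` values by hand and shows each model's relative share. The raw weights add up to 1.10, not 1.0. Whether and how the merge rescales them is not stated in this card, so the normalized shares are only an illustration of relative influence.

```python
# Weights copied by hand from the YAML configuration above.
weights = {
    "NousResearch/Meta-Llama-3-8B-Instruct": 0.25,
    "beomi/Llama-3-KoEn-8B-Instruct-preview": 0.15,
    "asiansoul/Llama-3-Open-Ko-Linear-8B": 0.1,
    "maum-ai/Llama-3-MAAL-8B-Instruct-v0.1": 0.1,
    "asiansoul/Joah-Llama-3-KoEn-8B-Coder-v1": 0.2,
    "ajibawa-2023/Code-Llama-3-8B": 0.05,
    "defog/llama-3-sqlcoder-8b": 0.1,
    "Locutusque/llama-3-neural-chat-v2.2-8B": 0.1,
    "NousResearch/Hermes-2-Pro-Llama-3-8B": 0.05,
}

total = sum(weights.values())
# Relative share of each model if the weights were rescaled to sum to 1.
shares = {name: w / total for name, w in weights.items()}

for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{share:6.1%}  {name}")
print(f"raw total: {total:.2f}")
```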