Upload 4 files
- .gitattributes +1 -0
- README.md +93 -0
- config.json +5 -0
- imatrix-20k_random_data.dat +3 -0
- kiqu.webp +0 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
+imatrix-20k_random_data.dat filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -1,3 +1,96 @@
---
license: cc-by-sa-4.0
language:
- ko
- en
model_creator: maywell
model_name: kiqu-70b
model_type: mistral
prompt_template: '[INST] {prompt} [/INST]

'
quantized_by: noopSD
---

> This repo contains quantized large language model (LLM) weight files in GGUF format for [maywell/kiqu-70b](https://huggingface.co/maywell/kiqu-70b). The quantization was calibrated with an importance matrix computed from [20k_random_data.txt](https://github.com/ggerganov/llama.cpp/files/13970111/20k_random_data.txt).
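
For reference, a minimal sketch of loading one of these GGUF files with `llama-cpp-python`; the repo id, quant filename, and runtime settings are illustrative assumptions, not taken from this repo's file list:

```python
# Sketch only: download a GGUF quant and load it with llama-cpp-python.
# Repo id and filename are hypothetical -- substitute the actual files in this repo.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="noopSD/kiqu-70b-GGUF",      # assumed repo id
    filename="kiqu-70b.Q4_K_M.gguf",     # hypothetical quant filename
)

llm = Llama(
    model_path=gguf_path,
    n_ctx=4096,        # context window; adjust to your hardware
    n_gpu_layers=-1,   # offload all layers to GPU if memory allows
)
```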

# **kiqu-70b** [(Arena Leaderboard)](https://huggingface.co/spaces/instructkr/ko-chatbot-arena-leaderboard)

<img src="./kiqu.webp" alt="kiqu-70B" width="390"/>

**kiqu-70b** is an SFT+DPO trained model based on Miqu-70B-Alpaca-DPO using **Korean** datasets.

Since this model is a finetune of miqu-1-70b, a leaked early version of Mistral-Medium, using it for commercial purposes is at your own risk.

Apart from that, the model itself follows **cc-by-sa-4.0**.

# **Model Details**

**Base Model**
miqu-1-70b (Early Mistral-Medium)

**Instruction format**

It follows the **Mistral** format.
Giving the model few-shot examples is highly recommended.

```
[INST] {instruction}
[/INST] {output}
```

Multi-shot

```
[INST] {instruction}
[/INST] {output}

[INST] {instruction}
[/INST] {output}

[INST] {instruction}
[/INST] {output}
.
.
.
```
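
The multi-shot layout above is plain string concatenation; the helper below is an illustrative sketch of assembling it (the function is not part of this repo):

```python
# Illustrative only: build a Mistral-style multi-shot prompt in the layout shown above.
def build_prompt(shots: list[tuple[str, str]], instruction: str, system: str = "") -> str:
    parts = []
    if system:
        parts.append(system)  # optional system line placed before the first [INST]
    for shot_instruction, shot_output in shots:  # few-shot examples are highly recommended
        parts.append(f"[INST] {shot_instruction}\n[/INST] {shot_output}\n")
    # Final turn: generation starts right after [/INST], with no trailing space.
    parts.append(f"[INST] {instruction}\n[/INST]")
    return "\n".join(parts)

# Example: zero-shot prompt
prompt = build_prompt([], "Introduce yourself in one sentence.")
```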

**Recommended Template** - 1-shot with system prompt

```
너는 kiqu-70B라는 한국어에 특화된 언어모델이야. 깔끔하고 자연스럽게 대답해줘!
[INST] 안녕?
[/INST] 안녕하세요! 무엇을 도와드릴까요? 질문이나 궁금한 점이 있다면 언제든지 말씀해주세요.

[INST] {instruction}
[/INST]
```

(The Korean system prompt tells the model it is kiqu-70B, a language model specialized in Korean, and asks it to answer cleanly and naturally; the 1-shot exchange is a short greeting and reply.)

A trailing space after [/INST] can affect the model's performance by a significant margin, so it is strongly recommended not to include a trailing space after [/INST] in the chat template when running inference.
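
As a minimal illustration (the model path and generation settings are placeholders, not files from this repo), the prompt string passed to `llama-cpp-python` should end exactly at `[/INST]`, with no space after it:

```python
from llama_cpp import Llama

llm = Llama(model_path="kiqu-70b.Q4_K_M.gguf", n_ctx=4096)  # placeholder path

# Prompt ends immediately after [/INST] -- no trailing space.
prompt = "[INST] 안녕?\n[/INST]"

out = llm(
    prompt,
    max_tokens=256,
    stop=["[INST]", "</s>"],  # stop before the model opens a new turn
)
print(out["choices"][0]["text"])
```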

# **Model Benchmark**

TBD

# **Author's Message**

This model's training was sponsored by no one, only by the support of people around the Earth.

[Support Me](https://www.buymeacoffee.com/mwell)

[Discord Server](https://discord.gg/MrBt3PXdXc)

Contact me on Discord - is.maywell

Follow me on Twitter - https://twitter.com/stablefluffy
config.json
ADDED
@@ -0,0 +1,5 @@
{
  "architectures": [
    "LlamaForCausalLM"
  ]
}
imatrix-20k_random_data.dat
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:751debf93409471426055211d47bc7386dce4c95d7c4274bb45ce7d7635b3845
size 24922254
kiqu.webp
ADDED