Commit 9f94c6b · 1 Parent: be5502f
Update README.md

README.md CHANGED
@@ -12,113 +12,16 @@ tags:
 - llama-2
 - llama-2-chat
 license: apache-2.0
-library_name: peft
 ---
-# komt
-
-https://github.com/davidkim205/komt
-
-This model is the [korean Llama 2 7B-chat](https://huggingface.co/davidkim205/komt-Llama-2-7b-chat-hf) quantized to 4-bit with [llama.cpp](https://github.com/ggerganov/llama.cpp).
-
-Because this model uses the same GGML format as TheBloke's releases, it works with the following libraries and UIs.
-
-The following content references [TheBloke/Llama-2-13B-chat-GGML](https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML#metas-llama-2-13b-chat-ggml).
-
-GGML files are for CPU + GPU inference using [llama.cpp](https://github.com/ggerganov/llama.cpp) and libraries and UIs which support this format, such as:
-
-* [KoboldCpp](https://github.com/LostRuins/koboldcpp), a powerful GGML web UI with full GPU acceleration out of the box. Especially good for storytelling.
-* [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with GPU acceleration via the c_transformers backend.
-* [LM Studio](https://lmstudio.ai/), a fully featured local GUI. Supports full GPU acceleration on macOS; also supports Windows, without GPU acceleration.
-* [text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most popular web UI. Requires extra steps to enable GPU acceleration via the llama.cpp backend.
-* [ctransformers](https://github.com/marella/ctransformers), a Python library with LangChain support and an OpenAI-compatible AI server.
-* [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with an OpenAI-compatible API server.
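Of the libraries above, llama-cpp-python can load this GGML file directly from Python. A minimal sketch, assuming a llama-cpp-python release old enough to still read GGML files (newer releases expect GGUF) and the local model path used later in this card:

```python
from pathlib import Path

# Hypothetical local path; adjust to wherever you downloaded the .bin file.
model_path = Path("./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin")

if model_path.exists():
    # Needs an older llama-cpp-python release with GGML support (assumption).
    from llama_cpp import Llama

    llm = Llama(model_path=str(model_path), n_ctx=512)
    out = llm(
        "### instruction: 누전차단기가 내려가는 이유는 무엇입니까?\n\n### Response:",
        max_tokens=256,
        stop=["###"],
    )
    print(out["choices"][0]["text"])
else:
    print("model file not found; download ggml-model-q4_0.bin first")
```

This is a sketch, not part of the komt repository; for an OpenAI-compatible HTTP server, llama-cpp-python and ctransformers both offer one as noted above.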
 ## Model Details

 * **Model Developers** : davidkim (changyeon kim)
 * **Repository** : https://github.com/davidkim205/komt
-* **quant methods** : q4_0, q4_1, q5_0, q5_1, q2_k, q3_k, q3_k_m, q3_k_l, q4_k, q4_k_s, q4_k_m, q5_k, q5_k_s, q5_k_m, q8_0
-
-## Prompt Template
-
-```
-### instruction: {prompt}
-
-### Response:
-```
-
-Examples (the prompt asks: "What is the mandatory interval for the comprehensive (periodic) vehicle inspection?"):
-
-```
-### instruction: 자동차 종합(정기)검사 의무기간은 얼마인가요?
-
-### Response:
-```
-
-response:
-
-```
-### instruction: 자동차 종합(정기)검사 의무기간은 얼마인가요?
-
-### Response:자동차 종합(정기)검사는 2년
-1991년 7월 1일에 고시된 '자동차 보험료 조정 기준'에서 제정된 기준 중 경량 이상차를 제외한 모든 승용자동차는 2년마다 필요하다. 이 법은 차량에 관계없이 2년마다 정기검사를 해야 한다고 규정한다.
-```
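The template above can be wired into a small helper when calling the model from code. A sketch (the function name is my own, not part of the komt repo):

```python
def build_prompt(instruction: str) -> str:
    """Format a user instruction with the komt prompt template shown above."""
    return f"### instruction: {instruction}\n\n### Response:"

# Example: the vehicle-inspection question from the README.
print(build_prompt("자동차 종합(정기)검사 의무기간은 얼마인가요?"))
```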
-## Usage
-
-When using the original [llama.cpp](https://github.com/ggerganov/llama.cpp):
-
-```
-make -j && ./main -m ./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin -p "### instruction: 누전차단기가 내려가는 이유는 무엇입니까?\n\n### Response:"
-```
-
-When using the llama.cpp modified for Korean multi-task (recommended):
-Refer to https://github.com/davidkim205/komt/tree/main/llama.cpp
-
-```
-make -j && ./main -m ./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin -p "누전차단기가 내려가는 이유는 무엇입니까?"
-```
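As the console log below shows, `./main` echoes the prompt before the completion, so a caller that wraps the binary has to split the answer back out. A sketch of that post-processing (the helper name is hypothetical):

```python
def extract_response(output: str) -> str:
    """Return the text after the last '### Response:' marker in llama.cpp output."""
    marker = "### Response:"
    _, sep, tail = output.rpartition(marker)
    # If the marker is absent, return the output unchanged (stripped).
    return tail.strip() if sep else output.strip()

raw = "### instruction: ...\n\n### Response: 누전차단기가 내려가는 이유는 ..."
print(extract_response(raw))
```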
-response:
-
-```
-$ make -j && ./main -m ./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin -p "누전차단기가 내려가는 이유는 무엇입니까?"
-I llama.cpp build info:
-I UNAME_S: Linux
-I UNAME_P: x86_64
-I UNAME_M: x86_64
-I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS
-I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS
-I LDFLAGS:
-I CC: cc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
-I CXX: g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
-
-make: Nothing to be done for 'default'.
-main: build = 987 (3ebb009)
-main: seed = 1692168046
-llama.cpp: loading model from ./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin
-llama_model_load_internal: format     = ggjt v3 (latest)
-llama_model_load_internal: n_vocab    = 32000
-llama_model_load_internal: n_ctx      = 512
-llama_model_load_internal: n_embd     = 4096
-llama_model_load_internal: n_mult     = 5504
-llama_model_load_internal: n_head     = 32
-llama_model_load_internal: n_head_kv  = 32
-llama_model_load_internal: n_layer    = 32
-llama_model_load_internal: n_rot      = 128
-llama_model_load_internal: n_gqa      = 1
-llama_model_load_internal: rnorm_eps  = 5.0e-06
-llama_model_load_internal: n_ff       = 11008
-llama_model_load_internal: freq_base  = 10000.0
-llama_model_load_internal: freq_scale = 1
-llama_model_load_internal: ftype      = 2 (mostly Q4_0)
-llama_model_load_internal: model size = 7B
-llama_model_load_internal: ggml ctx size = 0.08 MB
-llama_model_load_internal: mem required = 3647.96 MB (+ 256.00 MB per state)
-llama_new_context_with_model: kv self size = 256.00 MB
-llama_new_context_with_model: compute buffer total size = 71.84 MB
-
-system_info: n_threads = 8 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
-sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
-generate: n_ctx = 512, n_batch = 512, n_predict = -1, n_keep = 0
-
-### instruction: 누전차단기가 내려가는 이유는 무엇입니까?
-
-### Response:누전차단기가 내려가는 이유는 다음과 같습니다: 1. 고장이나 오작동 확인: 누전차단기가 몇 차례 들어오면 고장이 되거나 오작동을 방지하는 데 도움이 됩니다. 2. 누전 사고 피해: 많은 누전차단기가 내려가면 지역에서 일어나는 누전 사고의 영향을 줄이는 것으로 나타났습니다. 3. 안전성: 누전차단기가 내려가면 전반적인 안전성이 향상됩니다.
-```
 - llama-2
 - llama-2-chat
 license: apache-2.0
 ---
+# komt : korean multi task instruction tuning model
+
+![multi task instruction tuning.jpg](https://github.com/davidkim205/komt/assets/16680469/c7f6ade7-247e-4b62-a94f-47e19abea68e)
+
+Recently, following the success of ChatGPT, numerous large language models have emerged in an attempt to catch up with its capabilities.
+However, when it comes to Korean, many models still struggle to provide accurate answers or to generate Korean text effectively.
+This study addresses these challenges by introducing a multi-task instruction technique that leverages supervised datasets from various tasks to create training data for Large Language Models (LLMs).

 ## Model Details

 * **Model Developers** : davidkim (changyeon kim)
 * **Repository** : https://github.com/davidkim205/komt
+* **quant methods** : q4_0, q4_1, q5_0, q5_1, q2_k, q3_k, q3_k_m, q3_k_l, q4_k, q4_k_s, q4_k_m, q5_k, q5_k_s, q5_k_m, q8_0
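Each quant method corresponds to one `.bin` file in the repository. Assuming the `ggml-model-<method>.bin` naming used by the `ggml-model-q4_0.bin` file in the usage examples above (an assumption, not stated for every method), picking a file programmatically might look like:

```python
# Quant methods listed in this model card.
QUANT_METHODS = [
    "q4_0", "q4_1", "q5_0", "q5_1", "q2_k", "q3_k", "q3_k_m", "q3_k_l",
    "q4_k", "q4_k_s", "q4_k_m", "q5_k", "q5_k_s", "q5_k_m", "q8_0",
]

def model_filename(method: str) -> str:
    """Map a quant method to its file name (assumed ggml-model-<method>.bin pattern)."""
    if method not in QUANT_METHODS:
        raise ValueError(f"unknown quant method: {method}")
    return f"ggml-model-{method}.bin"

print(model_filename("q4_0"))  # ggml-model-q4_0.bin
```

Lower-bit methods (q2_k, q3_k, q4_0) trade quality for memory; q8_0 is closest to the unquantized weights.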