davidkim205 committed on
Commit
9f94c6b
1 Parent(s): be5502f

Update README.md

Files changed (1)
  1. README.md +6 -103
README.md CHANGED
@@ -12,113 +12,16 @@ tags:
  - llama-2
  - llama-2-chat
  license: apache-2.0
- library_name: peft
  ---
- # komt-Llama-2-7b-chat-hf-ggml
-
- https://github.com/davidkim205/komt
-
- This is the [korean Llama 2 7B-chat](https://huggingface.co/davidkim205/komt-Llama-2-7b-chat-hf) model quantized to 4-bit (and the other GGML formats listed below) with [llama.cpp](https://github.com/ggerganov/llama.cpp).
-
- Because this model uses the same GGML format as TheBloke's releases, it works with the libraries and UIs listed below.
-
- The following list is adapted from [TheBloke/Llama-2-13B-chat-GGML](https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML#metas-llama-2-13b-chat-ggml).
-
- GGML files are for CPU + GPU inference using [llama.cpp](https://github.com/ggerganov/llama.cpp) and libraries and UIs which support this format (a minimal Python loading sketch follows this list), such as:
- * [KoboldCpp](https://github.com/LostRuins/koboldcpp), a powerful GGML web UI with full GPU acceleration out of the box. Especially good for storytelling.
- * [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with GPU acceleration via the c_transformers backend.
- * [LM Studio](https://lmstudio.ai/), a fully featured local GUI. Supports full GPU acceleration on macOS, and also runs on Windows without GPU acceleration.
- * [text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most popular web UI. Requires extra steps to enable GPU acceleration via the llama.cpp backend.
- * [ctransformers](https://github.com/marella/ctransformers), a Python library with LangChain support and an OpenAI-compatible AI server.
- * [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with an OpenAI-compatible API server.
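As a minimal sketch (not part of the original card) of how the q4_0 file could be loaded from Python with llama-cpp-python, assuming a release from the GGML era (current versions read GGUF only) and that the file has already been downloaded; the path and generation settings are illustrative, and the prompt follows the template described further down:

```python
# Hedged sketch: load the GGML file with llama-cpp-python and run one completion.
# The model path is illustrative -- point it at your local ggml-model-q4_0.bin.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin",
    n_ctx=512,  # matches the context size shown in the log further down
)

prompt = "### instruction: 자동차 종합(정기)검사 의무기간은 얼마인가요?\n\n### Response:"
output = llm(prompt, max_tokens=256, stop=["### instruction:"], temperature=0.8)
print(output["choices"][0]["text"])
```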
 
  ## Model Details

  * **Model Developers** : davidkim (Changyeon Kim)
  * **Repository** : https://github.com/davidkim205/komt
- * **quant methods** : q4_0, q4_1, q5_0, q5_1, q2_k, q3_k, q3_k_m, q3_k_l, q4_k, q4_k_s, q4_k_m, q5_k, q5_k_s, q5_k_m, q8_0
-
- ## Prompt Template
- ```
- ### instruction: {prompt}
-
- ### Response:
- ```
- Example (the prompt asks how long the mandatory interval for the comprehensive (regular) vehicle inspection is):
- ```
- ### instruction: 자동차 종합(정기)검사 의무기간은 얼마인가요?
-
- ### Response:
-
- ```
- Response:
- ```
- ### instruction: 자동차 종합(정기)검사 의무기간은 얼마인가요?
-
- ### Response:자동차 종합(정기)검사는 2년
- 1991년 7월 1일에 고시된 '자동차 보험료 조정기준'에서 취리로부터 제정된 기준 상 경량 살수차를 제외한 자동차 모든 승용자동차는 2년마다 필요하다. 이 법은 차량에 관계없이 2년마다 정기검사를 해야한다고 규제했다.
- ```
-
-
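For clarity, applying this template to a question from Python could look like the following; the `build_prompt` helper is illustrative and not part of the card:

```python
# Illustrative helper: wrap a user question in the card's prompt template.
def build_prompt(question: str) -> str:
    return f"### instruction: {question}\n\n### Response:"

print(build_prompt("자동차 종합(정기)검사 의무기간은 얼마인가요?"))
```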
- ## Usage
-
- When using the original [llama.cpp](https://github.com/ggerganov/llama.cpp) (the prompt asks why an earth-leakage circuit breaker trips):
- ```
- make -j && ./main -m ./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin -p "### instruction: 누전차단기가 내려가는 이유는 무엇입니까?\n\n### Response:"
-
- ```
- When using the modified llama.cpp for Korean multi-task (recommended):
- See https://github.com/davidkim205/komt/tree/main/llama.cpp
- ```
- make -j && ./main -m ./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin -p "누전차단기가 내려가는 이유는 무엇입니까?"
- ```
- Response:
- ```
- $ make -j && ./main -m ./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin -p "누전차단기가 내려가는 이유는 무엇입니까?"
- I llama.cpp build info:
- I UNAME_S: Linux
- I UNAME_P: x86_64
- I UNAME_M: x86_64
- I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS
- I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS
- I LDFLAGS:
- I CC: cc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
- I CXX: g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
-
- make: Nothing to be done for 'default'.
- main: build = 987 (3ebb009)
- main: seed = 1692168046
- llama.cpp: loading model from ./models/komt-Llama-2-7b-chat-hf-ggml/ggml-model-q4_0.bin
- llama_model_load_internal: format = ggjt v3 (latest)
- llama_model_load_internal: n_vocab = 32000
- llama_model_load_internal: n_ctx = 512
- llama_model_load_internal: n_embd = 4096
- llama_model_load_internal: n_mult = 5504
- llama_model_load_internal: n_head = 32
- llama_model_load_internal: n_head_kv = 32
- llama_model_load_internal: n_layer = 32
- llama_model_load_internal: n_rot = 128
- llama_model_load_internal: n_gqa = 1
- llama_model_load_internal: rnorm_eps = 5.0e-06
- llama_model_load_internal: n_ff = 11008
- llama_model_load_internal: freq_base = 10000.0
- llama_model_load_internal: freq_scale = 1
- llama_model_load_internal: ftype = 2 (mostly Q4_0)
- llama_model_load_internal: model size = 7B
- llama_model_load_internal: ggml ctx size = 0.08 MB
- llama_model_load_internal: mem required = 3647.96 MB (+ 256.00 MB per state)
- llama_new_context_with_model: kv self size = 256.00 MB
- llama_new_context_with_model: compute buffer total size = 71.84 MB
-
- system_info: n_threads = 8 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
- sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
- generate: n_ctx = 512, n_batch = 512, n_predict = -1, n_keep = 0
-
-
- ### instruction: 누전차단기가 내려가는 이유는 무엇입니까?
-
- ### Response:누전차단기가 내려가는 이유는 다음과 같습니다:1. 고장이나 오작동 확인: 누전차단기가 몇 차례 들어오면 고장이 나거나 오작동을 방지하는 데 도움이 됩니다.2. 누전 사고 피해: 많은 누전차단기가 내려가면 지역에서 일어나는 누전 사고의 영향을 줄이는 것으로 나타났습니다.3. 안정성: 누전차단기가 내려가면 전반적인 안정성이 향상됩니다.
- ```
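As a rough Python counterpart to the commands above (not part of the original card), the same file could in principle be loaded through ctransformers, which is listed among the compatible libraries. The repository id and `model_file` name below are assumptions based on this card; check the repository's file list before relying on them:

```python
# Hedged sketch: load the GGML file via ctransformers and generate a completion.
# The repo id and file name are assumed from this card; verify them on the Hub.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "davidkim205/komt-Llama-2-7b-chat-hf-ggml",
    model_file="ggml-model-q4_0.bin",
    model_type="llama",
)

prompt = "### instruction: 누전차단기가 내려가는 이유는 무엇입니까?\n\n### Response:"
print(llm(prompt, max_new_tokens=256))
```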
 
  - llama-2
  - llama-2-chat
  license: apache-2.0
  ---
+ # komt : Korean multi-task instruction tuning model
+ ![multi task instruction tuning.jpg](https://github.com/davidkim205/komt/assets/16680469/c7f6ade7-247e-4b62-a94f-47e19abea68e)
+
+ Recently, due to the success of ChatGPT, numerous large language models have emerged in an attempt to catch up with its capabilities.
+ However, when it comes to Korean language performance, many of these models still struggle to provide accurate answers or to generate fluent Korean text.
+ This study addresses these challenges by introducing a multi-task instruction technique that leverages supervised datasets from various tasks to create training data for Large Language Models (LLMs).
 
  ## Model Details

  * **Model Developers** : davidkim (Changyeon Kim)
  * **Repository** : https://github.com/davidkim205/komt
+ * **quant methods** : q4_0, q4_1, q5_0, q5_1, q2_k, q3_k, q3_k_m, q3_k_l, q4_k, q4_k_s, q4_k_m, q5_k, q5_k_s, q5_k_m, q8_0