renillhuang committed
Commit d28971c
1 Parent(s): 386af23
Update README_ko.md

README_ko.md CHANGED (+33 -4)
@@ -32,7 +32,7 @@
 - [Model Introduction](#model-introduction)
 - [Model Download](#model-download)
 - [Evaluation Results](#model-benchmark)
-- [Model Inference](#model-inference)
+- [Model Inference](#model-inference)[<img src="./assets/imgs/vllm.png" alt="vllm" height="20"/>](#vllm) [<img src="./assets/imgs/llama_cpp.png" alt="llamacpp" height="20"/>](#llama-cpp)
 - [Declarations & License](#declarations-license)
 - [Company Introduction](#company-introduction)

@@ -265,10 +265,39 @@ CUDA_VISIBLE_DEVICES=0 python demo/text_generation_base.py --model OrionStarAI/O
 CUDA_VISIBLE_DEVICES=0 python demo/text_generation.py --model OrionStarAI/Orion-14B-Chat --tokenizer OrionStarAI/Orion-14B-Chat --prompt Hello. What is your name?

 ```

-
-
+## 4.4. Inference with vLLM
+
+- Project URL<br>
+  https://github.com/vllm-project/vllm
+
+- Pull request<br>
+  https://github.com/vllm-project/vllm/pull/2539
+
+<a name="llama-cpp"></a><br>
+## 4.5. Inference with llama.cpp
+
+- Project URL<br>
+  https://github.com/ggerganov/llama.cpp
+
+- Pull request<br>
+  https://github.com/ggerganov/llama.cpp/pull/5118
+
+- How to convert to GGUF format
+
+```shell
+python convert-hf-to-gguf.py path/to/Orion-14B-Chat --outfile chat.gguf
+```
+
+- How to run model inference
+
+```shell
+./main --frequency-penalty 0.5 --frequency-penalty 0.5 --top-k 5 --top-p 0.9 -m chat.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e
+```
+
+## 4.6. Example Outputs
+
+### 4.6.1. Casual Chat

 `````
 User: Hello, what is your name?
@@ -295,7 +324,7 @@ Orion-14B: There was a young girl named ...
 This story tells us that if we have courage and determination, we can overcome every difficulty and achieve our dreams.
 `````

-### 4.
+### 4.6.2. Japanese & Korean

 `````
 User: 自己紹介してください
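The new section 4.4 only links to the vLLM project and to the pull request that adds Orion support. As a point of reference, below is a minimal offline-generation sketch using vLLM's Python API, assuming a vLLM build that already contains that pull request; the prompt and sampling values are illustrative and not part of the commit.

```python
# Minimal sketch: offline generation with vLLM's Python API.
# Assumes the installed vLLM already supports Orion models
# (https://github.com/vllm-project/vllm/pull/2539).
from vllm import LLM, SamplingParams

llm = LLM(model="OrionStarAI/Orion-14B-Chat", trust_remote_code=True)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

# Illustrative prompt; pass a list to batch several prompts at once.
outputs = llm.generate(["Hello. What is your name?"], params)
for out in outputs:
    print(out.outputs[0].text)
```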
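Likewise, once the checkpoint has been converted to chat.gguf as in section 4.5, it can be driven from Python through the llama-cpp-python bindings rather than the ./main binary. This is a sketch under that assumption; the bindings are not mentioned in the commit, and the context size and sampling values are illustrative.

```python
# Sketch: loading the converted GGUF file with the llama-cpp-python bindings
# (pip install llama-cpp-python). chat.gguf is the file produced by the
# conversion step in section 4.5.
from llama_cpp import Llama

llm = Llama(model_path="chat.gguf", n_ctx=4096)
result = llm(
    "Building a website can be done in 10 simple steps:\nStep 1:",
    max_tokens=400,
    top_k=5,
    top_p=0.9,
    frequency_penalty=0.5,
)
print(result["choices"][0]["text"])
```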