renillhuang committed
Commit 139a004 • 1 Parent(s): 46b2f07

Update README.md

Files changed (1)
  1. README.md +33 -4
README.md CHANGED
@@ -45,7 +45,7 @@ tags:
  - [📖 Model Introduction](#model-introduction)
  - [🔗 Model Download](#model-download)
  - [🔖 Model Benchmark](#model-benchmark)
- - [📊 Model Inference](#model-inference)
  - [📜 Declarations & License](#declarations-license)
  - [🥇 Company Introduction](#company-introduction)

@@ -278,9 +278,38 @@ CUDA_VISIBLE_DEVICES=0 python demo/text_generation.py --model OrionStarAI/Orion-

  ```

- ## 4.4 Example Output

- ### 4.4.1. Casual Chat

  `````
  User: Hello
@@ -302,7 +331,7 @@ User: Tell me a joke.
  Orion-14B: Sure, here's a classic one-liner: Why don't scientists trust atoms? Because they make up everything.
  `````

- ### 4.4.2. Japanese & Korean Chat

  `````
  User:自己を紹介してください

  - [📖 Model Introduction](#model-introduction)
  - [🔗 Model Download](#model-download)
  - [🔖 Model Benchmark](#model-benchmark)
+ - [📊 Model Inference](#model-inference) [<img src="./assets/imgs/vllm.png" alt="vllm" height="20"/>](#vllm) [<img src="./assets/imgs/llama_cpp.png" alt="llamacpp" height="20"/>](#llama-cpp)
  - [📜 Declarations & License](#declarations-license)
  - [🥇 Company Introduction](#company-introduction)

 

  ```

+ ## 4.4. Inference by vllm

+ - Project URL<br>
+ https://github.com/vllm-project/vllm
+
+ - Pull Request<br>
+ https://github.com/vllm-project/vllm/pull/2539
+
+ <a name="llama-cpp"></a><br>
+ ## 4.5. Inference by llama.cpp
+
+ - Project URL<br>
+ https://github.com/ggerganov/llama.cpp
+
+ - Pull Request<br>
+ https://github.com/ggerganov/llama.cpp/pull/5118
+
+ - How to convert to GGUF model
+
+ ```shell
+ python convert-hf-to-gguf.py path/to/Orion-14B-Chat --outfile chat.gguf
+ ```
+
+ - How to run generation
+
+ ```shell
+ ./main --frequency-penalty 0.5 --top-k 5 --top-p 0.9 -m chat.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e
+ ```
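
Reviewer note: the two llama.cpp steps added in section 4.5 (convert to GGUF, then run generation) could also be scripted. The sketch below only assembles the argument lists, using the script name, binary name, and flags that appear in the README commands. The helper names `build_convert_command` and `build_generation_command` are hypothetical, and the code assumes it is run from a llama.cpp checkout where `convert-hf-to-gguf.py` and `./main` exist.

```python
import shlex


def build_convert_command(hf_model_dir, outfile):
    """Argument list for converting a Hugging Face checkpoint to GGUF.

    Hypothetical helper; mirrors the README's convert-hf-to-gguf.py call.
    """
    return ["python", "convert-hf-to-gguf.py", hf_model_dir, "--outfile", outfile]


def build_generation_command(model_path, prompt, n_predict=400,
                             frequency_penalty=0.5, top_k=5, top_p=0.9,
                             binary="./main"):
    """Argument list for a llama.cpp generation run.

    Hypothetical helper; defaults are the values from the README example.
    """
    return [
        binary,
        "--frequency-penalty", str(frequency_penalty),
        "--top-k", str(top_k),
        "--top-p", str(top_p),
        "-m", model_path,
        "-p", prompt,
        "-n", str(n_predict),
        "-e",  # let llama.cpp process escape sequences such as \n in the prompt
    ]


if __name__ == "__main__":
    convert = build_convert_command("path/to/Orion-14B-Chat", "chat.gguf")
    generate = build_generation_command(
        "chat.gguf",
        "Building a website can be done in 10 simple steps:\\nStep 1:",
    )
    # Print shell-quoted commands; pass the lists to subprocess.run(...) to execute them.
    print(shlex.join(convert))
    print(shlex.join(generate))
```

Building the commands as lists rather than one shell string keeps the prompt as a single argument regardless of the quotes or spaces it contains.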
+
+ ## 4.6 Example Output
+
+ ### 4.6.1. Casual Chat

  `````
  User: Hello

  Orion-14B: Sure, here's a classic one-liner: Why don't scientists trust atoms? Because they make up everything.
  `````

+ ### 4.6.2. Japanese & Korean Chat

  `````
  User:自己を紹介してください