Tags: Text Generation · Transformers · GGUF · mistral · openchat · C-RLFT · text-generation-inference
TheBloke committed
Commit
5fcde88
1 parent: 5eedfe1

Upload README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -19,7 +19,7 @@ model_creator: OpenChat
 model_name: Openchat 3.5 1210
 model_type: mistral
 pipeline_tag: text-generation
-prompt_template: 'GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:
+prompt_template: 'GPT4 Correct User: {prompt}<|end_of_turn|>GPT4 Correct Assistant:
 
   '
 quantized_by: TheBloke
@@ -88,10 +88,10 @@ Here is an incomplete list of clients and libraries that are known to support GG
 <!-- repositories-available end -->
 
 <!-- prompt-template start -->
-## Prompt template: OpenChat
+## Prompt template: OpenChat-Correct
 
 ```
-GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:
+GPT4 Correct User: {prompt}<|end_of_turn|>GPT4 Correct Assistant:
 
 ```
 
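The corrected template is a plain format string with a single `{prompt}` slot. As a minimal sketch of filling it for one turn (the `build_prompt` helper name and the example message are illustrative, not part of the model card):

```python
# Illustrative helper (not part of this commit): fill the corrected
# OpenChat template for a single user turn.
def build_prompt(user_message: str) -> str:
    return f"GPT4 Correct User: {user_message}<|end_of_turn|>GPT4 Correct Assistant:"

print(build_prompt("Explain GGUF quantization in one sentence."))
# GPT4 Correct User: Explain GGUF quantization in one sentence.<|end_of_turn|>GPT4 Correct Assistant:
```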
@@ -210,7 +210,7 @@ Windows Command Line users: You can set the environment variable by running `set
 Make sure you are using `llama.cpp` from commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
-./main -ngl 35 -m openchat-3.5-1210.Q4_K_M.gguf --color -c 8192 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:"
+./main -ngl 35 -m openchat-3.5-1210.Q4_K_M.gguf --color -c 8192 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "GPT4 Correct User: {prompt}<|end_of_turn|>GPT4 Correct Assistant:"
 ```
 
 Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
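For scripting the same invocation, a rough Python wrapper might look like the sketch below; it assumes the `./main` binary and the `.gguf` file sit in the working directory, exactly as in the command above, and the concrete question is illustrative:

```python
import subprocess

# Fill the corrected template with a concrete question (illustrative).
prompt = "GPT4 Correct User: What is GGUF?<|end_of_turn|>GPT4 Correct Assistant:"

# Same flags as the README command above; adjust -ngl for your GPU.
subprocess.run([
    "./main", "-ngl", "35",
    "-m", "openchat-3.5-1210.Q4_K_M.gguf",
    "--color", "-c", "8192",
    "--temp", "0.7", "--repeat_penalty", "1.1",
    "-n", "-1", "-p", prompt,
], check=True)
```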
@@ -271,7 +271,7 @@ llm = Llama(
 
 # Simple inference example
 output = llm(
-    "GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:", # Prompt
+    "GPT4 Correct User: {prompt}<|end_of_turn|>GPT4 Correct Assistant:", # Prompt
     max_tokens=512,  # Generate up to 512 tokens
     stop=["</s>"],   # Example stop token - not necessarily correct for this specific model! Please check before using.
     echo=True        # Whether to echo the prompt
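Assembled into a self-contained llama-cpp-python call, the corrected prompt might be used as sketched below. The model path and `n_gpu_layers` mirror the shell example, and `<|end_of_turn|>` as the stop token is an assumption to verify for this model:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./openchat-3.5-1210.Q4_K_M.gguf",  # assumes a local download
    n_ctx=8192,       # matches -c 8192 in the shell example
    n_gpu_layers=35,  # matches -ngl 35; use 0 for CPU-only
)

output = llm(
    "GPT4 Correct User: Write a haiku about quantization.<|end_of_turn|>GPT4 Correct Assistant:",
    max_tokens=512,
    stop=["<|end_of_turn|>"],  # assumed stop token - check before relying on it
    echo=True,
)
print(output["choices"][0]["text"])
```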
 