Tags: Text Generation · Transformers · GGUF · mistral · openchat · C-RLFT · text-generation-inference
TheBloke committed
Commit
5fcde88
1 parent: 5eedfe1

Upload README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -19,7 +19,7 @@ model_creator: OpenChat
 model_name: Openchat 3.5 1210
 model_type: mistral
 pipeline_tag: text-generation
-prompt_template: 'GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:
+prompt_template: 'GPT4 Correct User: {prompt}<|end_of_turn|>GPT4 Correct Assistant:
 
   '
 quantized_by: TheBloke
@@ -88,10 +88,10 @@ Here is an incomplete list of clients and libraries that are known to support GG
 <!-- repositories-available end -->
 
 <!-- prompt-template start -->
-## Prompt template: OpenChat
+## Prompt template: OpenChat-Correct
 
 ```
-GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:
+GPT4 Correct User: {prompt}<|end_of_turn|>GPT4 Correct Assistant:
 
 ```
 
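The corrected template is a plain format string with a single `{prompt}` slot. As a minimal sketch of filling it for one turn (the `build_prompt` helper name and the example message are illustrative, not part of the model card):

```python
# Illustrative helper (not part of this commit): fill the corrected
# OpenChat template for a single user turn.
def build_prompt(user_message: str) -> str:
    return f"GPT4 Correct User: {user_message}<|end_of_turn|>GPT4 Correct Assistant:"

print(build_prompt("Explain GGUF quantization in one sentence."))
# GPT4 Correct User: Explain GGUF quantization in one sentence.<|end_of_turn|>GPT4 Correct Assistant:
```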
@@ -210,7 +210,7 @@ Windows Command Line users: You can set the environment variable by running `set
 Make sure you are using `llama.cpp` from commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
-./main -ngl 35 -m openchat-3.5-1210.Q4_K_M.gguf --color -c 8192 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:"
+./main -ngl 35 -m openchat-3.5-1210.Q4_K_M.gguf --color -c 8192 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "GPT4 Correct User: {prompt}<|end_of_turn|>GPT4 Correct Assistant:"
 ```
 
 Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
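For scripting the same invocation, a rough Python wrapper might look like the sketch below; it assumes the `./main` binary and the `.gguf` file sit in the working directory, exactly as in the command above, and the concrete question is illustrative:

```python
import subprocess

# Fill the corrected template with a concrete question (illustrative).
prompt = "GPT4 Correct User: What is GGUF?<|end_of_turn|>GPT4 Correct Assistant:"

# Same flags as the README command above; adjust -ngl for your GPU.
subprocess.run([
    "./main", "-ngl", "35",
    "-m", "openchat-3.5-1210.Q4_K_M.gguf",
    "--color", "-c", "8192",
    "--temp", "0.7", "--repeat_penalty", "1.1",
    "-n", "-1", "-p", prompt,
], check=True)
```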
@@ -271,7 +271,7 @@ llm = Llama(
 
 # Simple inference example
 output = llm(
-    "GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:", # Prompt
+    "GPT4 Correct User: {prompt}<|end_of_turn|>GPT4 Correct Assistant:", # Prompt
     max_tokens=512,  # Generate up to 512 tokens
     stop=["</s>"],   # Example stop token - not necessarily correct for this specific model! Please check before using.
     echo=True        # Whether to echo the prompt
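Assembled into a self-contained llama-cpp-python call, the corrected prompt might be used as sketched below. The model path and `n_gpu_layers` mirror the shell example, and `<|end_of_turn|>` as the stop token is an assumption to verify for this model:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./openchat-3.5-1210.Q4_K_M.gguf",  # assumes a local download
    n_ctx=8192,       # matches -c 8192 in the shell example
    n_gpu_layers=35,  # matches -ngl 35; use 0 for CPU-only
)

output = llm(
    "GPT4 Correct User: Write a haiku about quantization.<|end_of_turn|>GPT4 Correct Assistant:",
    max_tokens=512,
    stop=["<|end_of_turn|>"],  # assumed stop token - check before relying on it
    echo=True,
)
print(output["choices"][0]["text"])
```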
 