second-state/Llama-3-8B-Instruct-262k-GGUF

juntaoyuan committed on Apr 29

Commit 74cf9e2 (1 parent: f5a2c14)

Update context size

Files changed (1)

README.md +3 -3

@@ -50,7 +50,7 @@ tags:
     {{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
     ```
-- Context size: `4096`
+- Context size: `262144`
 - Run as LlamaEdge service
@@ -58,7 +58,7 @@ tags:
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-3-8B-Instruct-262k-Q5_K_M.gguf \
     llama-api-server.wasm \
     --prompt-template llama-3-chat \
-    --ctx-size 4096 \
+    --ctx-size 262144 \
     --model-name llama-3-8B-instruct-262k
   ```
@@ -68,7 +68,7 @@ tags:
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-3-8B-Instruct-262k-Q5_K_M.gguf \
     llama-chat.wasm \
     --prompt-template llama-3-chat \
-    --ctx-size 4096
+    --ctx-size 262144
   ```
 ## Quantized GGUF Models