second-state/Mistral-Nemo-Instruct-2407-GGUF

apepkuss79 committed on Jul 18

Commit 1e3654f • 1 parent: 154d007

Upload README.md with huggingface_hub

README.md +8 -6

@@ -33,7 +33,9 @@ language:
 ## Run with LlamaEdge
-- LlamaEdge version: [v0.12.3](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.12.3)
 - Prompt template
@@ -43,11 +45,11 @@ language:
     ```text
     <s>[INST] {user_message_1} [/INST]{assistant_message_1}</s>[INST] {user_message_2} [/INST]{assistant_message_2}</s>
-    ```
 - Context size: `128000`
-- Run as LlamaEdge service
   ```bash
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:Mistral-Nemo-Instruct-2407-Q5_K_M.gguf \
@@ -64,8 +66,8 @@ language:
     llama-chat.wasm \
     --prompt-template mistral-instruct \
     --ctx-size 128000
-  ```
 ## Quantized GGUF Models
 | Name | Quant method | Bits | Size | Use case |
@@ -82,6 +84,6 @@ language:
 | [Mistral-Nemo-Instruct-2407-Q5_K_S.gguf](https://huggingface.co/second-state/Mistral-Nemo-Instruct-2407-GGUF/blob/main/Mistral-Nemo-Instruct-2407-Q5_K_S.gguf)     | Q5_K_S | 5 | 5 GB| large, low quality loss - recommended |
 | [Mistral-Nemo-Instruct-2407-Q6_K.gguf](https://huggingface.co/second-state/Mistral-Nemo-Instruct-2407-GGUF/blob/main/Mistral-Nemo-Instruct-2407-Q6_K.gguf)     | Q6_K   | 6 | 5.95 GB| very large, extremely low quality loss |
 | [Mistral-Nemo-Instruct-2407-Q8_0.gguf](https://huggingface.co/second-state/Mistral-Nemo-Instruct-2407-GGUF/blob/main/Mistral-Nemo-Instruct-2407-Q8_0.gguf)     | Q8_0   | 8 | 7.7 GB| very large, extremely low quality loss - not recommended |
-| [Mistral-Nemo-Instruct-2407-f16.gguf](https://huggingface.co/second-state/Mistral-Nemo-Instruct-2407-GGUF/blob/main/Mistral-Nemo-Instruct-2407-f16.gguf)     | f16   | 16 | 14.5 GB|  |
 *Quantized with llama.cpp b3405.*

 ## Run with LlamaEdge
+- LlamaEdge version: coming soon
+<!-- - LlamaEdge version: [v0.12.3](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.12.3)
 - Prompt template
     ```text
     <s>[INST] {user_message_1} [/INST]{assistant_message_1}</s>[INST] {user_message_2} [/INST]{assistant_message_2}</s>
+    ``` -->
 - Context size: `128000`
+<!-- - Run as LlamaEdge service
   ```bash
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:Mistral-Nemo-Instruct-2407-Q5_K_M.gguf \
     llama-chat.wasm \
     --prompt-template mistral-instruct \
     --ctx-size 128000
+  ``` -->
+<!--
 ## Quantized GGUF Models
 | Name | Quant method | Bits | Size | Use case |
 | [Mistral-Nemo-Instruct-2407-Q5_K_S.gguf](https://huggingface.co/second-state/Mistral-Nemo-Instruct-2407-GGUF/blob/main/Mistral-Nemo-Instruct-2407-Q5_K_S.gguf)     | Q5_K_S | 5 | 5 GB| large, low quality loss - recommended |
 | [Mistral-Nemo-Instruct-2407-Q6_K.gguf](https://huggingface.co/second-state/Mistral-Nemo-Instruct-2407-GGUF/blob/main/Mistral-Nemo-Instruct-2407-Q6_K.gguf)     | Q6_K   | 6 | 5.95 GB| very large, extremely low quality loss |
 | [Mistral-Nemo-Instruct-2407-Q8_0.gguf](https://huggingface.co/second-state/Mistral-Nemo-Instruct-2407-GGUF/blob/main/Mistral-Nemo-Instruct-2407-Q8_0.gguf)     | Q8_0   | 8 | 7.7 GB| very large, extremely low quality loss - not recommended |
+| [Mistral-Nemo-Instruct-2407-f16.gguf](https://huggingface.co/second-state/Mistral-Nemo-Instruct-2407-GGUF/blob/main/Mistral-Nemo-Instruct-2407-f16.gguf)     | f16   | 16 | 14.5 GB|  | -->
 *Quantized with llama.cpp b3405.*
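
The prompt template that this commit comments out can be illustrated with a small helper. This is only a sketch of how the `mistral-instruct` layout shown in the README could be assembled by hand; the function name `build_mistral_prompt` and the `(user, assistant)` pair representation are assumptions made for illustration and are not part of LlamaEdge or the README.

```python
from typing import List, Optional, Tuple


def build_mistral_prompt(turns: List[Tuple[str, Optional[str]]]) -> str:
    """Assemble a prompt following the template shown in the README.

    `turns` is a list of (user_message, assistant_message) pairs.
    Use None as the assistant message of the final pair to leave the
    prompt open for the model to complete (hypothetical convention).
    """
    parts = ["<s>"]  # the template opens with a single <s> marker
    for user_message, assistant_message in turns:
        # Each user message is wrapped in [INST] ... [/INST]
        parts.append(f"[INST] {user_message} [/INST]")
        if assistant_message is not None:
            # A completed assistant reply is closed with </s>
            parts.append(f"{assistant_message}</s>")
    return "".join(parts)


if __name__ == "__main__":
    history = [
        ("What is GGUF?", "A file format for quantized models."),
        ("Which quantization should I pick?", None),
    ]
    print(build_mistral_prompt(history))
```

When LlamaEdge is run with `--prompt-template mistral-instruct`, the runtime is expected to apply the template itself, so a helper like this would only matter when constructing raw prompts manually.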