Since I am a free user, for the time being I only upload models that might b
| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| [Llama-3_1-Nemotron-51B-Instruct.Q6_K.gguf](https://huggingface.co/ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF/blob/main/Llama-3_1-Nemotron-51B-Instruct.Q6_K.gguf) | Q6_K | 42.2GB | Good for Nvidia cards or Apple Silicon with 48GB RAM. Should perform very close to the original. |
| [Llama-3_1-Nemotron-51B-Instruct.Q5_K_M.gguf](https://huggingface.co/ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF/blob/main/Llama-3_1-Nemotron-51B-Instruct.Q5_K_M.gguf) | Q5_K_M | 36.5GB | Good for an A100 40GB or dual 3090. Better than Q4_K_M but larger and slower. |
| [Llama-3_1-Nemotron-51B-Instruct.Q4_K_M.gguf](https://huggingface.co/ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF/blob/main/Llama-3_1-Nemotron-51B-Instruct.Q4_K_M.gguf) | Q4_K_M | 31GB | Good for an A100 40GB or dual 3090. Better cost-performance ratio than Q5_K_M. |
| [Llama-3_1-Nemotron-51B-Instruct.Q4_0.gguf](https://huggingface.co/ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF/blob/main/Llama-3_1-Nemotron-51B-Instruct.Q4_0.gguf) | Q4_0 | 29.3GB | For 32GB cards, e.g. 5090. |
| [Llama-3_1-Nemotron-51B-Instruct.Q4_0_4_8.gguf](https://huggingface.co/ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF/blob/main/Llama-3_1-Nemotron-51B-Instruct.Q4_0_4_8.gguf) | Q4_0_4_8 | 29.3GB | For Apple Silicon. |
| [Llama-3_1-Nemotron-51B-Instruct.Q3_K_S.gguf](https://huggingface.co/ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF/blob/main/Llama-3_1-Nemotron-51B-Instruct.Q3_K_S.gguf) | Q3_K_S | 22.7GB | Largest model that fits on a single 3090. |
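Since each quant in the table is a single file, you can fetch just the one that fits your hardware rather than cloning the whole repo. A minimal sketch, assuming you have `huggingface_hub` installed (for `huggingface-cli`) and a local llama.cpp build providing the `llama-cli` binary; the prompt is only a placeholder:

```shell
# Download only the Q4_0 quant (~29.3GB) from this repo
huggingface-cli download ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF \
  Llama-3_1-Nemotron-51B-Instruct.Q4_0.gguf --local-dir .

# Run it with llama.cpp, offloading all layers to the GPU (-ngl 99)
./llama-cli -m Llama-3_1-Nemotron-51B-Instruct.Q4_0.gguf -ngl 99 \
  -p "Write a haiku about quantization."
```

Swap the filename for any other quant in the table; for the Q6_K file make sure you have roughly 48GB of VRAM/RAM available, as noted above.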