etemiz/Llama-3.1-405B-Inst-GGUF

etemiz committed 18 days ago

Commit a07d189 • 1 Parent(s): 0fe9680

Update README.md

Files changed (1)

README.md +4 -4

README.md CHANGED

@@ -2,10 +2,10 @@
 license: llama3.1
 ---
 Llama 3.1 405B Quants and llama.cpp versions that is used for quantization
-- IQ1_S: 86.8 GB  b3459
-- IQ1_M: 95.1 GB  b3459
-- IQ2_XXS: 109.0 GB  b3459
-- IQ3_XXS: 157.7 GB  b3484
+- IQ1_S: 86.8 GB - b3459
+- IQ1_M: 95.1 GB - b3459
+- IQ2_XXS: 109.0 GB - b3459
+- IQ3_XXS: 157.7 GB - b3484
 Quantization from BF16 here:
 https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf/