---
license: llama3.1
---
More updates: I'm going to attempt an imatrix quant at Q3, so hopefully, if that goes well and runs right on latest master, I'll upload it ASAP!
Not uploading Q6. This quant may not work with your current llama.cpp build, as there were breaking changes in the latest master.
Q4_K_M IS UP!
For those of you who do not know how to reassemble the parts: `cat Meta-Llama-3.1-405B-Instruct-Q4_K_M.gguf.part_* > Meta-Llama-3.1-405B-Instruct-Q4_K_M.gguf`
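If you want to see why that one-liner works, here is the same pattern on a tiny dummy file (filenames here are placeholders, not the real upload): `split` names its chunks with lexically ordered suffixes, so the `part_*` glob concatenates them back in the correct order.

```shell
# Make a dummy "model" file and split it into 3-byte chunks with a part_ prefix
printf 'ABCDEFGH' > demo.gguf
split -b 3 demo.gguf demo.gguf.part_

# The part_* glob expands in lexical order (part_aa, part_ab, part_ac),
# so cat restores the original byte-for-byte
cat demo.gguf.part_* > demo_reassembled.gguf
cmp demo.gguf demo_reassembled.gguf && echo "files match"
```

After the real download, you can also `ls -l` the parts and check that their sizes add up to the size of the reassembled GGUF.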
An amusing generation to kill the time. Is the instruct model tuned to think it can ONLY be run in the cloud?
![image/png](https://cdn-uploads.huggingface.co/production/uploads/655dc641accde1bbc8b41aec/krAj94J-WfaDr68GDohpr.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/655dc641accde1bbc8b41aec/h03HVIi53D6F6Uo0Qiz3Z.png)
Why download these GGUFs? I made some modifications to the llama.cpp quantization process to make SURE the right tokenizer is used, unlike the Smaug BPE GGUFs that are out now.
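If you want to check for yourself which tokenizer a downloaded GGUF carries, the metadata is readable with nothing but the stdlib. This is a minimal sketch of a GGUF key/value header parser (not the llama.cpp loader itself); in recent llama.cpp conversions the tokenizer family sits under `tokenizer.ggml.model` and the pre-tokenizer name (the field where a wrong `smaug-bpe` would show up) under `tokenizer.ggml.pre`.

```python
import struct

# Byte sizes of GGUF scalar value types (type id -> size)
_SCALAR = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9

def _read_str(f):
    # GGUF strings are a little-endian uint64 length followed by UTF-8 bytes
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")

def _skip_value(f, vtype):
    # Seek past a value we don't care about instead of loading it
    if vtype == _STRING:
        (n,) = struct.unpack("<Q", f.read(8))
        f.seek(n, 1)
    elif vtype == _ARRAY:
        (etype,) = struct.unpack("<I", f.read(4))
        (count,) = struct.unpack("<Q", f.read(8))
        for _ in range(count):
            _skip_value(f, etype)
    else:
        f.seek(_SCALAR[vtype], 1)

def gguf_string_metadata(path):
    """Return every string-valued metadata key/value in a GGUF file."""
    out = {}
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
        for _ in range(n_kv):
            key = _read_str(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            if vtype == _STRING:
                out[key] = _read_str(f)
            else:
                _skip_value(f, vtype)
    return out
```

Usage: `gguf_string_metadata("Meta-Llama-3.1-405B-Instruct-Q4_K_M.gguf").get("tokenizer.ggml.pre")` reports the pre-tokenizer the file was converted with. Note the array-skipping loop is simple rather than fast, so scanning a file with huge token arrays takes a moment.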