This is [mistralai/Mistral-Large-Instruct-2407](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407), converted to GGUF and quantized to q8_0. Both the model and the embedding/output tensors are q8_0.
The model is split with the `llama.cpp/llama-gguf-split` CLI utility into shards no larger than 7 GB each, so an interrupted download can be resumed without starting over.
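A rough sketch of how such a split can be produced and undone with `llama-gguf-split` (the filenames below are illustrative, and flag names may differ between llama.cpp builds — check `llama-gguf-split --help` on yours):

```shell
# Split a monolithic GGUF into shards of at most 7 GB each.
# Output shards are named like <prefix>-00001-of-0000N.gguf.
llama-gguf-split --split --split-max-size 7G \
  Mistral-Large-Instruct-2407-q8_0.gguf \
  Mistral-Large-Instruct-2407-q8_0

# Recent llama.cpp builds can load the first shard directly and locate the
# rest, but the shards can also be merged back into a single file:
llama-gguf-split --merge \
  Mistral-Large-Instruct-2407-q8_0-00001-of-00002.gguf \
  Mistral-Large-Instruct-2407-q8_0-merged.gguf
```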
File format: [GGUFv3](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md)