---
tags:
- uqff
- mistral.rs
base_model: meta-llama/Llama-3.1-8B-Instruct
base_model_relation: quantized
---

# `meta-llama/Llama-3.1-8B-Instruct`, UQFF quantization

Run with [mistral.rs](https://github.com/EricLBuehler/mistral.rs). Documentation: [UQFF docs](https://github.com/EricLBuehler/mistral.rs/blob/master/docs/UQFF.md).

1) **Flexible** 🌀: Multiple quantization formats in *one* file format with *one* framework to run them all.
2) **Reliable** 🔒: Compatibility ensured with *embedded* and *checked* semantic versioning information from day 1.
3) **Easy** 🤗: Download UQFF models *easily* and *quickly* from Hugging Face, or use a local file.
4) **Customizable** 🛠️: Make and publish your own UQFF files in minutes.

## Examples

|Quantization type(s)|Example|
|--|--|
|FP8|`./mistralrs-server -i plain -m EricB/Llama-3.1-8B-Instruct-UQFF --from-uqff llama3.1-8b-instruct-f8e4m3.uqff`|
|HQQ4|`./mistralrs-server -i plain -m EricB/Llama-3.1-8B-Instruct-UQFF --from-uqff llama3.1-8b-instruct-hqq4.uqff`|
|HQQ8|`./mistralrs-server -i plain -m EricB/Llama-3.1-8B-Instruct-UQFF --from-uqff llama3.1-8b-instruct-hqq8.uqff`|
|Q3K|`./mistralrs-server -i plain -m EricB/Llama-3.1-8B-Instruct-UQFF --from-uqff llama3.1-8b-instruct-q3k.uqff`|
|Q4K|`./mistralrs-server -i plain -m EricB/Llama-3.1-8B-Instruct-UQFF --from-uqff llama3.1-8b-instruct-q4k.uqff`|
|Q5K|`./mistralrs-server -i plain -m EricB/Llama-3.1-8B-Instruct-UQFF --from-uqff llama3.1-8b-instruct-q5k.uqff`|
|Q8_0|`./mistralrs-server -i plain -m EricB/Llama-3.1-8B-Instruct-UQFF --from-uqff llama3.1-8b-instruct-q8_0.uqff`|
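
Every command in the table above differs only in the `.uqff` file name, so the invocation can be assembled from a small lookup table. The following is a minimal sketch, not part of mistral.rs: `MODEL_ID`, `UQFF_FILES`, and `build_command` are hypothetical names introduced here for illustration, and the only flags used (`-i plain`, `-m`, `--from-uqff`) are the ones shown in the table. It assumes the `mistralrs-server` binary is in the current directory.

```python
import shlex

# Repository and file names are copied from the Examples table.
MODEL_ID = "EricB/Llama-3.1-8B-Instruct-UQFF"
UQFF_FILES = {
    "FP8": "llama3.1-8b-instruct-f8e4m3.uqff",
    "HQQ4": "llama3.1-8b-instruct-hqq4.uqff",
    "HQQ8": "llama3.1-8b-instruct-hqq8.uqff",
    "Q3K": "llama3.1-8b-instruct-q3k.uqff",
    "Q4K": "llama3.1-8b-instruct-q4k.uqff",
    "Q5K": "llama3.1-8b-instruct-q5k.uqff",
    "Q8_0": "llama3.1-8b-instruct-q8_0.uqff",
}


def build_command(quant, binary="./mistralrs-server"):
    """Return the argument list that runs the given quantization type.

    `binary` is an assumed path to the mistralrs-server executable.
    """
    try:
        uqff_file = UQFF_FILES[quant]
    except KeyError:
        choices = ", ".join(sorted(UQFF_FILES))
        raise ValueError(f"unknown quantization {quant!r}; choose from: {choices}") from None
    return [binary, "-i", "plain", "-m", MODEL_ID, "--from-uqff", uqff_file]


if __name__ == "__main__":
    # Print a shell-ready command line for the Q4K file.
    print(shlex.join(build_command("Q4K")))
```

The returned list can be passed directly to `subprocess.run` if you prefer to launch the server from Python rather than copying the printed command into a shell.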