meta-llama/Llama-4-Scout-17B-16E-Instruct, UQFF quantization

Run with mistral.rs. Documentation: UQFF docs.

  1. Flexible 🌀: Multiple quantization formats in one file format with one framework to run them all.
  2. Reliable 🔒: Compatibility ensured with embedded and checked semantic versioning information from day 1.
  3. Easy 🤗: Download UQFF models easily and quickly from Hugging Face, or use a local file.
  4. Customizable 🛠️: Make and publish your own UQFF files in minutes.

Examples

Note: If you are using an Apple Silicon device (on Metal), prefer an 🔥 AFQ quantization for best performance!

| Quantization type(s) | Example |
| -------------------- | ------- |
| Q4K | `./mistralrs-server -i vision-plain -m EricB/Llama-4-Scout-17B-16E-Instruct-UQFF -a llama4 --from-uqff "llama4-scout-instruct-q4k-0.uqff;llama4-scout-instruct-q4k-1.uqff;llama4-scout-instruct-q4k-2.uqff;llama4-scout-instruct-q4k-3.uqff;llama4-scout-instruct-q4k-4.uqff;llama4-scout-instruct-q4k-5.uqff;llama4-scout-instruct-q4k-6.uqff"` |
| AFQ4 | Coming soon! |
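The `--from-uqff` flag in the example above takes the model's shards as a single semicolon-separated list. A minimal sketch of assembling that argument programmatically, assuming the shard naming pattern shown in the Q4K example (`llama4-scout-instruct-q4k-<n>.uqff` for shards 0 through 6):

```python
# Build the semicolon-separated shard list passed to --from-uqff.
# The shard names follow the pattern from the Q4K example above (7 shards, 0-6).
shards = ";".join(f"llama4-scout-instruct-q4k-{i}.uqff" for i in range(7))
print(shards)
```

The resulting string can be passed verbatim (quoted, so the shell does not split on `;`) as the value of `--from-uqff`.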