dmx-llama-3.1-8b-m6
DMX M=7 compressed version of NousResearch/Meta-Llama-3.1-8B-Instruct.
Stats
- Source: NousResearch/Meta-Llama-3.1-8B-Instruct (FP16)
- Format: DMX BFP M=7 (7 mantissa bits, block floating point)
- File size: 6.53 GB (59% smaller than FP16)
- Quality: Outputs differ from FP16 by no more than normal GPU run-to-run variance (BF16-equivalent precision)
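To make the "block floating point" entry above concrete, here is a minimal sketch of the general technique: every value in a block shares one exponent, and each value keeps only a small signed integer mantissa. This is not the dmx-compress implementation. The function names `bfp_quantize` and `bfp_dequantize`, the round-to-nearest rule, and the choice of shared exponent are assumptions made for illustration; the real DMX block size, exponent width, and on-disk layout are not described in this card.

```python
import math

def bfp_quantize(block, mantissa_bits=7):
    """Illustrative block-floating-point quantizer (hypothetical, not the DMX codec).

    Returns (shared_exp, mantissas): one exponent for the whole block and a
    signed integer mantissa of `mantissa_bits` magnitude bits per value.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1, so every |x| < 2**e.
    _, shared_exp = math.frexp(max_abs)
    # One mantissa step is worth 2**(shared_exp - mantissa_bits).
    scale = 2.0 ** (shared_exp - mantissa_bits)
    limit = (1 << mantissa_bits) - 1  # largest representable magnitude
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return shared_exp, mantissas

def bfp_dequantize(shared_exp, mantissas, mantissa_bits=7):
    """Rebuild approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - mantissa_bits)
    return [m * scale for m in mantissas]

if __name__ == "__main__":
    exp, mants = bfp_quantize([0.5, -0.25, 0.1, 0.003])
    print(exp, mants)
    print(bfp_dequantize(exp, mants))
```

In this sketch, values much smaller than the block maximum lose the most precision (0.003 collapses to 0 when the block maximum is 0.5), which is why practical block floating point schemes keep blocks small.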
Usage
pip install dmx-compress dmx-runtime
from dmx_runtime import from_dmx_compressed
model = from_dmx_compressed(
    "model.dmx",
    model_id="NousResearch/Meta-Llama-3.1-8B-Instruct",
)
Compressed with dmx-compress.
Model tree for Senat1/dmx-llama-3.1-8b-m6
Base model
NousResearch/Meta-Llama-3.1-8B-Instruct