dmx-llama-3.1-8b-m6

A DMX-compressed (BFP, M=7) version of NousResearch/Meta-Llama-3.1-8B-Instruct.

Stats

  • Source: NousResearch/Meta-Llama-3.1-8B-Instruct (FP16)
  • Format: DMX BFP M=7 (7 mantissa bits, block floating point)
  • File size: 6.53 GB (59% smaller than FP16)
  • Quality: Within GPU variance of FP16 (BF16-equivalent precision)
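
To illustrate what "block floating point" with 7 mantissa bits means, here is a minimal generic sketch: every value in a block shares one exponent, and each value keeps only a small signed integer mantissa. This is not the dmx-compress implementation. The function names, the rounding and clamping choices, and the way the shared exponent is picked are all assumptions made for illustration; the actual DMX block size and bit layout are not described in this card.

```python
import math

def bfp_quantize(block, mantissa_bits=7):
    """Hypothetical helper: quantize floats to one shared exponent plus integer mantissas."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**shared_exp with 0.5 <= m < 1
    _, shared_exp = math.frexp(max_abs)
    # Scale so the largest magnitude lands just below 2**mantissa_bits
    scale = 2.0 ** (mantissa_bits - shared_exp)
    limit = 2 ** mantissa_bits - 1
    # Round to the nearest integer and clamp to the representable range
    mantissas = [max(-limit, min(limit, round(x * scale))) for x in block]
    return shared_exp, mantissas

def bfp_dequantize(shared_exp, mantissas, mantissa_bits=7):
    """Hypothetical helper: rebuild approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - mantissa_bits)
    return [m * scale for m in mantissas]

shared_exp, mantissas = bfp_quantize([0.5, -0.25, 0.125, 0.03125])
print(shared_exp, mantissas)                  # 0 [64, -32, 16, 4]
print(bfp_dequantize(shared_exp, mantissas))  # [0.5, -0.25, 0.125, 0.03125]
```

In this sketch, values much smaller than the largest one in the block lose precision, because they share its exponent. That trade-off is what lets the per-value storage stay small.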

Usage

pip install dmx-compress dmx-runtime

from dmx_runtime import from_dmx_compressed

model = from_dmx_compressed(
    "model.dmx",
    model_id="NousResearch/Meta-Llama-3.1-8B-Instruct"
)

Compressed with dmx-compress.
