Senat1/dmx-llama-3.1-8b-m6

dmx-llama-3.1-8b-m6

DMX M=7 compressed version of NousResearch/Meta-Llama-3.1-8B-Instruct.

Stats

Source: NousResearch/Meta-Llama-3.1-8B-Instruct (FP16)
Format: DMX BFP M=7 (7 mantissa bits, block floating point)
File size: 6.53 GB (59% smaller than FP16)
Quality: Within GPU variance of FP16 (BF16-equivalent precision)
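
For readers unfamiliar with block floating point, the sketch below illustrates the general idea behind the "7 mantissa bits" figure: each block of values shares one exponent, and each value keeps only a sign and a small integer mantissa. This is a generic illustration, not the DMX on-disk format. The function names, the block contents, and the rounding and clamping choices are all assumptions made for the example.

```python
import math

def bfp_quantize(block, mantissa_bits=7):
    """Generic block floating point sketch (hypothetical, not the DMX codec).

    Returns (shared_exp, mantissas), where each mantissa is a signed integer
    with magnitude below 2**mantissa_bits.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1, so every |x| < 2**e.
    _, shared_exp = math.frexp(max_abs)
    scale = 2.0 ** (shared_exp - mantissa_bits)
    limit = (1 << mantissa_bits) - 1
    # Round to the nearest step, then clamp so the mantissa fits in the bit budget.
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return shared_exp, mantissas

def bfp_dequantize(shared_exp, mantissas, mantissa_bits=7):
    """Rebuild approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - mantissa_bits)
    return [m * scale for m in mantissas]

block = [0.5, -0.25, 0.1, 0.003]
shared_exp, mantissas = bfp_quantize(block)
restored = bfp_dequantize(shared_exp, mantissas)
print(shared_exp, mantissas, restored)
```

Because the exponent is set by the largest value in the block, small values next to a large one lose relative precision (0.003 rounds to zero here), which is why practical schemes keep blocks short.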

Usage

pip install dmx-compress dmx-runtime

from dmx_runtime import from_dmx_compressed

model = from_dmx_compressed(
    "model.dmx",
    model_id="NousResearch/Meta-Llama-3.1-8B-Instruct"
)

Compressed with dmx-compress.
