DevQuasar
/

meta-llama.Meta-Llama-3-8B-Instruct-NVFP4A16

Text Generation

8-bit precision

compressed-tensors

Model card Files Files and versions

notes

experimental NVFP4A16 (confirmed forking with vLLM)

Please report if you find any issue with the model.

Any feedbacks are welcome!

'Make knowledge free for everyone'

Quantized version of: meta-llama/Meta-Llama-3-8B-Instruct

Downloads last month: 1

Safetensors

Model size

5B params

Tensor type

BF16

·

F32

·

F8_E4M3

·

U8

·

Model tree for DevQuasar/meta-llama.Meta-Llama-3-8B-Instruct-NVFP4A16

Base model

meta-llama/Llama-3.1-8B

Finetuned

meta-llama/Llama-3.1-8B-Instruct

Quantized

(639)

this model