πŸ“¦ Meta-Llama-3-70B-Instruct-4bit-gguf

meta-llama/Meta-Llama-3-70B-Instruct converted to GUFF format

QuantLLM Format

⭐ Star QuantLLM on GitHub


πŸ“– About This Model

This model is meta-llama/Meta-Llama-3-70B-Instruct converted to GUFF format.

Property Value
Base Model meta-llama/Meta-Llama-3-70B-Instruct
Format GUFF
Quantization None (Full Precision)
License apache-2.0
Created With QuantLLM

πŸš€ Quick Start

With Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("QuantLLM/Meta-Llama-3-70B-Instruct-4bit-gguf")
tokenizer = AutoTokenizer.from_pretrained("QuantLLM/Meta-Llama-3-70B-Instruct-4bit-gguf")

# Generate text
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With QuantLLM

from quantllm import TurboModel

# Load with automatic optimization
model = TurboModel.from_pretrained("QuantLLM/Meta-Llama-3-70B-Instruct-4bit-gguf")

# Generate
response = model.generate("Write a poem about coding")
print(response)

Requirements

pip install transformers torch

πŸ“Š Model Details

Property Value
Original Model meta-llama/Meta-Llama-3-70B-Instruct
Format GUFF
Quantization Full Precision
License apache-2.0
Export Date 2026-04-24
Exported By QuantLLM v2.0

πŸš€ Created with QuantLLM

QuantLLM

Convert any model to GGUF, ONNX, or MLX in one line!

from quantllm import turbo

# Load any HuggingFace model
model = turbo("meta-llama/Meta-Llama-3-70B-Instruct")

# Export to any format
model.export("guff", quantization="Q4_K_M")

# Push to HuggingFace
model.push("your-repo", format="guff")
GitHub Stars

πŸ“š Documentation Β· πŸ› Report Issue Β· πŸ’‘ Request Feature

Downloads last month
5
Safetensors
Model size
71B params
Tensor type
F32
Β·
BF16
Β·
U8
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for QuantLLM/Meta-Llama-3-70B-Instruct-4bit-gguf

Quantized
(44)
this model