Notice

Performance issues have been identified in this quantized model. Please use the newly uploaded model files in Dream-v0-Instruct-7B-4bit instead.

Quantized Dream-v0-Instruct-7B

This repository contains a 4-bit quantized version of the Dream-v0-Instruct-7B model, optimized for memory-efficient inference while maintaining good performance.

Model Details

  • Base Model: Dream-v0-Instruct-7B
  • Quantization: 4-bit quantization using bitsandbytes
  • Last Updated: 2025-04-07

Quantization Configuration

The model uses the following quantization settings:

from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16"
)
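To see why 4-bit quantization matters for a 7B model, a back-of-the-envelope estimate of raw weight storage helps. The sketch below is illustrative only: the parameter count is approximate, and real memory usage is higher because of fp16 compute buffers, activations, and NF4 quantization metadata (per-block scales).

```python
# Rough weight-storage estimate for different precisions (a sketch; actual
# GPU memory usage is higher due to activations and quantization metadata).
PARAMS = 7_000_000_000  # approximate parameter count of the 7B base model


def weight_memory_gib(num_params: int, bits_per_param: float) -> float:
    """Return raw weight storage in GiB at the given precision."""
    return num_params * bits_per_param / 8 / 1024**3


fp16_gib = weight_memory_gib(PARAMS, 16)  # ~13.0 GiB
nf4_gib = weight_memory_gib(PARAMS, 4)    # ~3.3 GiB
print(f"fp16: {fp16_gib:.1f} GiB, nf4: {nf4_gib:.1f} GiB")
```

This is why the 4-bit variant fits comfortably on consumer GPUs where the fp16 weights alone would not.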

Usage

Here's how to load and use the quantized model:

from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16"
)

model = AutoModel.from_pretrained(
    "Rainnighttram/Dream-7B-bnb-4bit",
    quantization_config=quantization_config,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "Rainnighttram/Dream-7B-bnb-4bit",
    trust_remote_code=True
)

Requirements

  • transformers==4.46.2
  • bitsandbytes
  • torch==2.5.1
  • Python 3.11+
  • accelerate>=0.26.0
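Because transformers and torch are pinned to exact versions above, a quick environment check can save a failed model load. The snippet below is a minimal sketch that mirrors the pinned entries from the list above using the standard library's `importlib.metadata`:

```python
# Sanity-check the exact version pins from the Requirements section
# (a sketch; extend REQUIRED with range checks for the other packages).
from importlib.metadata import PackageNotFoundError, version

REQUIRED = {
    "transformers": "4.46.2",  # exact pin
    "torch": "2.5.1",          # exact pin
}


def check_exact(package: str, expected: str) -> bool:
    """Return True only if the package is installed at exactly `expected`."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        return False


for pkg, pin in REQUIRED.items():
    print(pkg, "ok" if check_exact(pkg, pin) else f"needs =={pin}")
```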

Original Model

This is a quantized version of Dream-org/Dream-v0-Instruct-7B. Please refer to the original model card for more details about the base model's capabilities and limitations.

License

This model inherits its license from the original Dream-v0-Instruct-7B model. Please refer to the original model repository for licensing information.

Model Files

  • Format: Safetensors
  • Size: 4.45B params
  • Tensor types: FP16, F32, U8