
This model has been quantized using GPTQModel with the following configuration:

  • Bits: 4
  • Group Size: 128
  • Desc Act: true
  • Static Groups: false
  • Sym: true
  • LM Head: false
  • Damp Percent: 0.01
  • True Sequential: true
  • Model Name or Path:
  • Model File Base Name: model
  • Quant Method: auto_round
  • Checkpoint Format: gptq
  • Metadata
    • Quantizer: gptqmodel:0.9.8-dev0
    • Enable Full Range: false
    • Batch Size: 1
    • AMP: true
    • LR Scheduler: null
    • Enable Quanted Input: true
    • Enable Minmax Tuning: true
    • Learning Rate (LR): null
    • Minmax LR: null
    • Low GPU Memory Usage: true
    • Iterations (Iters): 200
    • Sequence Length (Seqlen): 2048
    • Number of Samples (Nsamples): 512
    • Sampler: rand
    • Seed: 42
    • Number of Blocks (Nblocks): 1
    • Gradient Accumulate Steps: 1
    • Not Use Best MSE: false
    • Dynamic Max Gap: -1
    • Data Type: int
    • Scale Data Type (Scale Dtype): fp16
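For reference, the settings above map onto a gptqmodel quantization config. The sketch below is illustrative rather than the exact script used for this checkpoint: class and method names follow recent gptqmodel releases (the quantizer recorded here is gptqmodel:0.9.8-dev0, whose API may differ), only the core GPTQ fields are shown, and the base model path and calibration data are placeholders.

```python
# Illustrative sketch: quantizing a model with gptqmodel using the core
# settings listed above. API names follow recent gptqmodel releases and may
# differ from 0.9.8-dev0; paths and calibration data are placeholders.
from gptqmodel import GPTQModel, QuantizeConfig

quant_config = QuantizeConfig(
    bits=4,               # Bits
    group_size=128,       # Group Size
    desc_act=True,        # Desc Act
    static_groups=False,  # Static Groups
    sym=True,             # Sym
    damp_percent=0.01,    # Damp Percent
)

# Placeholder calibration data; the card records 512 samples at sequence
# length 2048 with a random sampler and seed 42.
calibration_dataset = ["Example calibration text."] * 512

model = GPTQModel.load("path/to/base-model", quant_config)  # older versions: from_pretrained
model.quantize(calibration_dataset, batch_size=1)
model.save("path/to/quantized-output")                      # older versions: save_quantized
```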
Checkpoint details: Safetensors format, 261M parameters; tensor types I32, BF16, FP16.
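Because the checkpoint is stored in GPTQ format, it can typically be loaded through transformers with GPTQ support installed (for example via gptqmodel or auto-gptq). A minimal inference sketch, with a placeholder repo id standing in for this model's id:

```python
# Minimal inference sketch. The repo id below is a placeholder; substitute
# this model's actual id. Requires a recent transformers with GPTQ support.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/your-quantized-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```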