
This model has been quantized using GPTQModel with the following configuration:

  • Bits: 4
  • Group Size: 128
  • Desc Act: true
  • Static Groups: false
  • Sym: true
  • LM Head: false
  • Damp Percent: 0.01
  • True Sequential: true
  • Model Name or Path:
  • Model File Base Name: model
  • Quant Method: auto_round
  • Checkpoint Format: gptq
  • Metadata
    • Quantizer: gptqmodel:0.9.8-dev0
    • Enable Full Range: false
    • Batch Size: 1
    • AMP: true
    • LR Scheduler: null
    • Enable Quanted Input: true
    • Enable Minmax Tuning: true
    • Learning Rate (LR): null
    • Minmax LR: null
    • Low GPU Memory Usage: true
    • Iterations (Iters): 200
    • Sequence Length (Seqlen): 2048
    • Number of Samples (Nsamples): 512
    • Sampler: rand
    • Seed: 42
    • Number of Blocks (Nblocks): 1
    • Gradient Accumulate Steps: 1
    • Not Use Best MSE: false
    • Dynamic Max Gap: -1
    • Data Type: int
    • Scale Data Type (Scale Dtype): fp16
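For reference, the settings above map onto a gptqmodel quantization config. The sketch below is illustrative rather than the exact script used for this checkpoint: class and method names follow recent gptqmodel releases (the quantizer recorded here is gptqmodel:0.9.8-dev0, whose API may differ), only the core GPTQ fields are shown, and the base model path and calibration data are placeholders.

```python
# Illustrative sketch: quantizing a model with gptqmodel using the core
# settings listed above. API names follow recent gptqmodel releases and may
# differ from 0.9.8-dev0; paths and calibration data are placeholders.
from gptqmodel import GPTQModel, QuantizeConfig

quant_config = QuantizeConfig(
    bits=4,               # Bits
    group_size=128,       # Group Size
    desc_act=True,        # Desc Act
    static_groups=False,  # Static Groups
    sym=True,             # Sym
    damp_percent=0.01,    # Damp Percent
)

# Placeholder calibration data; the card records 512 samples at sequence
# length 2048 with a random sampler and seed 42.
calibration_dataset = ["Example calibration text."] * 512

model = GPTQModel.load("path/to/base-model", quant_config)  # older versions: from_pretrained
model.quantize(calibration_dataset, batch_size=1)
model.save("path/to/quantized-output")                      # older versions: save_quantized
```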
Checkpoint details: Safetensors format, 261M parameters; tensor types I32, BF16, FP16.
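Because the checkpoint is stored in GPTQ format, it can typically be loaded through transformers with GPTQ support installed (for example via gptqmodel or auto-gptq). A minimal inference sketch, with a placeholder repo id standing in for this model's id:

```python
# Minimal inference sketch. The repo id below is a placeholder; substitute
# this model's actual id. Requires a recent transformers with GPTQ support.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/your-quantized-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```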