LnL-AI/TinyLlama-1.1B-Chat-v1.0-GPTQ-Marlin-4bit

Qubitium committed on Mar 29

Commit 47db5ba • 1 Parent(s): 26ea47d

Update README.md

Files changed (1)

README.md +13 -11

@@ -5,14 +5,16 @@ license: unknown
 This is TinyLlama/TinyLlama-1.1B-Chat-v1.0 quantized with AutoGPTQ in GPTQ 4-bit Marlin format.
 **Quantize config:**
-"bits": 4,
-"group_size": 128,
-"damp_percent": 0.005,
-"desc_act": false,
-"static_groups": false,
-"sym": true,
-"true_sequential": true,
-"model_name_or_path": null,
-"model_file_base_name": null,
-"checkpoint_format": "marlin",
-"quant_method": "gptq"

+```
+  "bits": 4,
+  "group_size": 128,
+  "damp_percent": 0.01,
+  "desc_act": false,
+  "static_groups": false,
+  "sym": true,
+  "true_sequential": true,
+  "model_name_or_path": null,
+  "model_file_base_name": null,
+  "quant_method": "gptq",
+  "checkpoint_format": "marlin"
+```
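
To check what this commit changes in the quantize config beyond formatting, the two key/value fragments can be compared programmatically. The sketch below is a minimal, stdlib-only illustration: it assumes the fragments are the body of a JSON object (the README shows them without enclosing braces), and the names `OLD_FRAGMENT`, `NEW_FRAGMENT`, and `parse_fragment` are hypothetical helpers, not part of AutoGPTQ.

```python
import json

# Fragments copied from the removed (-) and added (+) sides of the diff.
# Assumption: they form the body of a JSON object once wrapped in braces.
OLD_FRAGMENT = """
"bits": 4,
"group_size": 128,
"damp_percent": 0.005,
"desc_act": false,
"static_groups": false,
"sym": true,
"true_sequential": true,
"model_name_or_path": null,
"model_file_base_name": null,
"checkpoint_format": "marlin",
"quant_method": "gptq"
"""

NEW_FRAGMENT = """
"bits": 4,
"group_size": 128,
"damp_percent": 0.01,
"desc_act": false,
"static_groups": false,
"sym": true,
"true_sequential": true,
"model_name_or_path": null,
"model_file_base_name": null,
"quant_method": "gptq",
"checkpoint_format": "marlin"
"""


def parse_fragment(fragment: str) -> dict:
    """Wrap a brace-less key/value fragment in {} so the json module can parse it."""
    return json.loads("{" + fragment + "}")


old_cfg = parse_fragment(OLD_FRAGMENT)
new_cfg = parse_fragment(NEW_FRAGMENT)

# Collect keys whose values differ between the two versions.
changed = {
    key: (old_cfg[key], new_cfg[key])
    for key in new_cfg
    if key in old_cfg and old_cfg[key] != new_cfg[key]
}

print(changed)  # {'damp_percent': (0.005, 0.01)}
```

Under these assumptions, the only value that differs is `damp_percent` (0.005 to 0.01). The remaining changes are presentational: the fragment is now fenced and indented, and `quant_method` and `checkpoint_format` swap positions.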