bhavyaaiplanet committed
Commit: e06a18a • 1 Parent(s): 6bae5f8
Update README.md
README.md CHANGED
@@ -27,6 +27,17 @@ effi 7b AWQ is a quantized version of effi 7b which is a 7 billion parameter model
 - **License:** Apache 2.0
 - **Quantized from model:** Effi-7b
 
+### Quantization Configuration
+
+
+"zero_point": true,
+"q_group_size": 128,
+"w_bit": 4,
+"version": "GEMM",
+"modules_to_not_convert": null
+
+
+
 
 ### Example of usage
 
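For reference, the keys added above form the `quant_config` dictionary that AutoAWQ consumes: 4-bit weights (`w_bit`), per-group scaling over blocks of 128 weights (`q_group_size`), asymmetric zero-point quantization, and the GEMM kernel layout. A minimal sketch of producing such a checkpoint with AutoAWQ 0.1.x follows; the source repo id and output directory are assumptions, not taken from the README:

```python
# Sketch: quantizing Effi-7b with the configuration from this commit.
# "aiplanet/effi-7b" and "effi-7b-awq" are assumed paths, not from the source.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "aiplanet/effi-7b"   # assumed source model id
quant_path = "effi-7b-awq"        # assumed local output directory

# Mirrors the configuration added in this commit; "modules_to_not_convert": null
# in the README means no layers are exempted, which is AutoAWQ's default.
quant_config = {
    "zero_point": True,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM",
}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

model.quantize(tokenizer, quant_config=quant_config)  # runs AWQ calibration
model.save_quantized(quant_path)                      # writes quantized weights + config
tokenizer.save_pretrained(quant_path)
```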
@@ -62,6 +73,10 @@ outputs = model.generate(input_ids=input_ids, max_new_tokens=512, top_p=0.9,temp
 
 # Print the result
 
-print(f"{tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0][len(template):]
+print(f"{tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0][len(template):]}")
+
+```
 
-
+### Framework versions
+- Transformers 4.37.2
+- Autoawq 0.1.8
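The corrected `print` line above closes the README's generation snippet. Pieced together, the usage section amounts to something like the following sketch; the repo id, prompt `template`, `temperature` value, and `do_sample` flag are assumptions, while the remaining `generate` arguments and the final `print` come from the diff itself:

```python
# Sketch of the README's usage flow; model id and prompt are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aiplanet/effi-7b-awq"  # assumed repo id for this AWQ checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Transformers 4.37 loads AWQ checkpoints directly when autoawq is installed
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

template = "User: Explain AWQ quantization briefly.\nAssistant:"  # placeholder prompt
input_ids = tokenizer(template, return_tensors="pt").input_ids.to(model.device)

# do_sample=True is assumed so that top_p and temperature take effect
outputs = model.generate(input_ids=input_ids, max_new_tokens=512,
                         top_p=0.9, temperature=0.7, do_sample=True)

# Print only the newly generated text, slicing off the prompt as the README does
print(f"{tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0][len(template):]}")
```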