Text Generation
Transformers
Safetensors
llama
4-bit precision
AWQ
Inference Endpoints
conversational
text-generation-inference
awq
Ubuntu committed on
Commit
90e8b82
1 Parent(s): 9d58ca5

adding quant config

Files changed (1)
  1. quant_config.json +6 -0
quant_config.json ADDED
@@ -0,0 +1,6 @@
+{
+  "zero_point": true,
+  "q_group_size": 128,
+  "w_bit": 4,
+  "version": "GEMM"
+}
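
To illustrate what the fields in this config typically mean, here is a minimal sketch in plain Python. It assumes the common reading of these keys: `w_bit` is the number of bits per weight, `q_group_size` is how many consecutive weights share one scale, and `zero_point` enables asymmetric quantization with an integer offset. The helper names (`quantize_group`, `dequantize_group`, `quantize_row`) and the sample data are hypothetical. The sketch shows generic group-wise uniform quantization only; it does not reproduce AWQ's activation-aware scaling, and the `version` field is not modeled.

```python
import json

# The config added by this commit, reproduced from the diff.
QUANT_CONFIG_TEXT = """
{
  "zero_point": true,
  "q_group_size": 128,
  "w_bit": 4,
  "version": "GEMM"
}
"""


def quantize_group(weights, w_bit, zero_point):
    """Uniformly quantize one group of floats. Returns (codes, scale, zero)."""
    if zero_point:
        # Asymmetric: map [min, max] onto the unsigned range [0, 2^w_bit - 1].
        qmin, qmax = 0, 2 ** w_bit - 1
        lo, hi = min(weights), max(weights)
        scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against a constant group
        zero = round(-lo / scale)
    else:
        # Symmetric: signed range centered on zero, no offset.
        qmax = 2 ** (w_bit - 1) - 1
        qmin = -qmax - 1
        scale = max(abs(w) for w in weights) / qmax or 1.0
        zero = 0
    codes = [min(qmax, max(qmin, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate float weights."""
    return [(c - zero) * scale for c in codes]


def quantize_row(row, config):
    """Split a row into groups of q_group_size and quantize each separately."""
    g = config["q_group_size"]
    groups = [row[i:i + g] for i in range(0, len(row), g)]
    return [quantize_group(grp, config["w_bit"], config["zero_point"]) for grp in groups]


config = json.loads(QUANT_CONFIG_TEXT)

# Hypothetical weight row of 256 values in [-0.5, 0.5]: two groups of 128.
row = [((i * 37) % 101 - 50) / 100.0 for i in range(256)]

quantized = quantize_row(row, config)
restored = [w for codes, scale, zero in quantized
            for w in dequantize_group(codes, scale, zero)]
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(f"groups={len(quantized)} max_error={max_error:.4f}")
```

With a group size of 128, each group stores its own scale and zero point, so the reconstruction error is bounded by that group's value range rather than the range of the whole tensor.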