KnutJaegersberg/Galactica-120B-GPTQ-2-bit-64g

KnutJaegersberg committed on Aug 13, 2023

Commit aa0ac4c • 1 Parent(s): 42c45e1

Update README.md

Files changed (1)

README.md +2 -0

@@ -7,6 +7,8 @@ Experimental quantization.
 Working inference code (regular inference with autogptq does not work without return_token_type_ids=False, didn't get it to work with textgen-webui):
 from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
+from transformers import AutoTokenizer, TextGenerationPipeline
 tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)
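
The README note says inference only works when the tokenizer is called with return_token_type_ids=False. When the tokenizer call cannot be changed (for example, inside a wrapper), a rough equivalent is to drop that entry from the encoded inputs before they reach the model. The sketch below assumes the tokenizer output is a dict-like mapping and that the extra token_type_ids entry is what causes the failure; drop_token_type_ids and example_encoded are hypothetical names used for illustration, not part of auto_gptq or transformers.

```python
from typing import Any, Dict, Mapping


def drop_token_type_ids(encoded: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the tokenizer output without the token_type_ids entry."""
    return {key: value for key, value in encoded.items() if key != "token_type_ids"}


# Stand-in for tokenizer output (assumption: a real tokenizer returns tensors here).
example_encoded = {
    "input_ids": [[1, 2, 3]],
    "token_type_ids": [[0, 0, 0]],
    "attention_mask": [[1, 1, 1]],
}

# The filtered mapping is what would be unpacked into the model call.
model_inputs = drop_token_type_ids(example_encoded)
print(sorted(model_inputs))
```

The original mapping is left untouched, so the helper is safe to apply to inputs that are reused elsewhere. Passing return_token_type_ids=False in the tokenizer call, as the README describes, remains the more direct route.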