webbigdata/C3TR-Adapter_gptq

text-generation-inference

4-bit precision

dahara1 committed on May 20

Commit 5f482cd • 1 Parent(s): a123405

Update README.md

Files changed (1)

README.md +52 -0

@@ -30,6 +30,58 @@ pip install -vvv --no-build-isolation -e .
 ### Sample code
 ```
+from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
+from optimum.gptq import GPTQQuantizer, load_quantized_model
+import torch
+model_name = "webbigdata/C3TR-Adapter_gptq"
+# thanks to tk-master
+# https://github.com/AutoGPTQ/AutoGPTQ/issues/406
+config = AutoConfig.from_pretrained(model_name)
+config.quantization_config["use_exllama"] = False
+config.quantization_config["exllama_config"] = {"version":2}
+max_memory={0: "12GiB", "cpu": "10GiB"}
+quantized_model = AutoModelForCausalLM.from_pretrained(model_name
+        , torch_dtype=torch.bfloat16  # change to torch.float16 on free Colab or other hardware that does not support bfloat16
+        , device_map="auto", max_memory=max_memory
+        , config=config)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+tokenizer.pad_token = tokenizer.unk_token
+prompt_text = """You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating.
+### Instruction:
+Translate English to Japanese.
+When translating, please use the following hints:
+[writing_style: web-fiction]
+[Madoka: まどか]
+[Madoka_first_person_and_ending: だね, よね]
+[Mami: マミ]
+[Mami_first_person_and_ending: 私, わね]
+[Sayaka: さやか]
+[Sayaka_first_person_and_ending: 私, かな]
+[Kyubey: キュゥべぇ]
+[Kyubey_first_person_and_ending: 僕, てよ]
+### Input:
+Madoka: "Thank you all for watching! You might've seen a bit of my dark side, but... don't mind that, okay?"
+Sayaka: "Well, thanks! Did my cuteness come across 100%?"
+Mami: "I'm glad you watched, but it's a bit embarrassing..."
+Kyubey: "Make a contract with me, and become a magical girl."
+### Response:
+"""
+tokens = tokenizer(prompt_text, return_tensors="pt",
+        padding=True, max_length=1600, truncation=True).to("cuda:0").input_ids
+output = quantized_model.generate(
+        input_ids=tokens,
+        max_new_tokens=800,
+        do_sample=True,
+        num_beams=3, temperature=0.5, top_p=0.3,
+        repetition_penalty=1.0)
+print(tokenizer.decode(output[0]))
 ```
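
The prompt in the sample follows a fixed layout: a system paragraph, `### Instruction:`, optional hint lines in square brackets, `### Input:`, and a trailing `### Response:`. The following is a minimal sketch of how that layout could be assembled programmatically and how the translation could be separated from the echoed prompt after decoding. The names `build_prompt`, `extract_response`, and `SYSTEM_PROMPT` are hypothetical helpers and are not part of this repository. The system text is shortened here, so substitute the full paragraph from the sample.

```python
# Hypothetical helpers; not part of the model repository.
# SYSTEM_PROMPT is abbreviated: replace it with the full system paragraph from the sample.
SYSTEM_PROMPT = (
    "You are a highly skilled professional Japanese-English and "
    "English-Japanese translator."
)


def build_prompt(source_text, direction="English to Japanese",
                 hints=None, system_prompt=SYSTEM_PROMPT):
    """Assemble a prompt in the Instruction / Input / Response layout of the sample."""
    lines = [system_prompt, "### Instruction:", f"Translate {direction}."]
    if hints:
        lines.append("When translating, please use the following hints:")
        # Each hint is written as [key: value], one per line, as in the sample.
        lines.extend(f"[{key}: {value}]" for key, value in hints.items())
    # The trailing empty string makes the prompt end with a newline after "### Response:".
    lines += ["### Input:", source_text, "### Response:", ""]
    return "\n".join(lines)


def extract_response(decoded_text, marker="### Response:"):
    """Return the text after the last marker, or the whole text if the marker is absent."""
    _, _, tail = decoded_text.rpartition(marker)
    return tail.strip()


prompt_text = build_prompt(
    'Kyubey: "Make a contract with me, and become a magical girl."',
    hints={"writing_style": "web-fiction", "Kyubey": "キュゥべぇ"},
)
print(prompt_text)
```

With the model loaded as in the sample, `extract_response(tokenizer.decode(output[0], skip_special_tokens=True))` would return only the generated translation. `skip_special_tokens` is an existing argument of `tokenizer.decode`.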