---
library_name: gptq
base_model: google/gemma-7b
language:
- ja
- en
tags:
- translation
- gptq
- gemma
- text-generation-inference
- nlp
---

### Model card
This is a GPTQ 4-bit quantized version of [C3TR-Adapter](https://huggingface.co/webbigdata/C3TR-Adapter), a model for English-Japanese and Japanese-English translation.

### A quick way to try it
You can run it on the free tier of Colab. (Translation quality is higher on the paid tiers with an L4 or A100 GPU.)  
[C3TR-Adapter_gptq_v2_Free_Colab_sample](https://github.com/webbigdata-jp/python_sample/blob/main/C3TR_Adapter_gptq_v2_Free_Colab_sample.ipynb)

### Install
See the official [AutoGPTQ page](https://github.com/AutoGPTQ/AutoGPTQ) for installation instructions.

I couldn't get it to work without installing from source.

```
git clone https://github.com/AutoGPTQ/AutoGPTQ.git && cd AutoGPTQ
pip install -vvv --no-build-isolation -e .
pip install optimum
```
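
After installing, it can help to confirm that the packages are importable before trying to load the model. The sketch below is not part of the original instructions: it assumes the import names `auto_gptq`, `optimum`, and `transformers`, and the helper name `check_packages` is hypothetical.

```python
import importlib.util


def check_packages(names=("auto_gptq", "optimum", "transformers")):
    """Return {import_name: bool} telling whether each top-level package can be found."""
    # find_spec returns None when a top-level module is not installed.
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Print a short report for the packages this model card relies on.
for name, found in check_packages().items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```

If any entry is reported as missing, repeat the installation steps above in the same Python environment.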

### Sample code
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
model_name = "webbigdata/C3TR-Adapter_gptq"

# Workaround: disable the ExLlama kernels when loading the quantized weights.
# Thanks to tk-master: https://github.com/AutoGPTQ/AutoGPTQ/issues/406
config = AutoConfig.from_pretrained(model_name)
config.quantization_config["use_exllama"] = False
config.quantization_config["exllama_config"] = {"version": 2}

# Adjust to your available memory. Key 0 is the first GPU.
max_memory = {0: "12GiB", "cpu": "10GiB"}

quantized_model = AutoModelForCausalLM.from_pretrained(
        model_name,
        # Use torch.float16 instead on free Colab or any GPU without bfloat16 support.
        torch_dtype=torch.bfloat16,
        device_map="auto",
        max_memory=max_memory,
        config=config)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.unk_token

prompt_text = """You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating.

### Instruction:
Translate English to Japanese.
When translating, please use the following hints:
[writing_style: web-fiction]
[Madoka: まどか]
[Madoka_first_person_and_ending: だね, よね]
[Mami: マミ]
[Mami_first_person_and_ending: 私, わね]
[Sayaka: さやか]
[Sayaka_first_person_and_ending: 私, かな]
[Kyubey: キュゥべぇ]
[Kyubey_first_person_and_ending: 僕, てよ]

### Input:
Madoka: "Thank you all for watching! You might've seen a bit of my dark side, but... don't mind that, okay?"
Sayaka: "Well, thanks! Did my cuteness come across 100%?"
Mami: "I'm glad you watched, but it's a bit embarrassing..."
Kyubey: "Make a contract with me, and become a magical girl."
### Response:
"""

# Tokenize the prompt and move the input IDs to the first GPU.
tokens = tokenizer(prompt_text, return_tensors="pt",
        padding=True, max_length=1600, truncation=True).to("cuda:0").input_ids

# Beam search with sampling; the output contains the prompt followed by the translation.
output = quantized_model.generate(
        input_ids=tokens,
        max_new_tokens=800,
        do_sample=True,
        num_beams=3, temperature=0.5, top_p=0.3,
        repetition_penalty=1.0)
print(tokenizer.decode(output[0]))

```
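
The prompt in the sample above follows a fixed layout: a system prompt, an `### Instruction:` section with optional `[key: value]` hints, an `### Input:` section, and a trailing `### Response:` marker. The following is a minimal sketch of helpers that assemble this layout and pull the translation back out of the decoded output. The function names `build_prompt` and `extract_translation` are hypothetical, the `<eos>` end-of-sequence string is an assumption based on the Gemma tokenizer, and whether the model behaves well when the hints block is omitted is not confirmed here.

```python
def build_prompt(system_prompt, direction, text, hints=None):
    """Assemble a prompt in the same layout as the sample code.

    direction: e.g. "Translate English to Japanese."
    hints: optional dict rendered as "[key: value]" lines.
    """
    lines = [system_prompt, "", "### Instruction:", direction]
    if hints:
        lines.append("When translating, please use the following hints:")
        lines.extend(f"[{key}: {value}]" for key, value in hints.items())
    # The prompt ends with the response marker and a newline, as in the sample.
    lines += ["", "### Input:", text, "### Response:", ""]
    return "\n".join(lines)


def extract_translation(decoded, eos_token="<eos>"):
    """Return only the text generated after the last "### Response:" marker."""
    answer = decoded.rsplit("### Response:", 1)[-1]
    # Drop the end-of-sequence token (assumed to be "<eos>") and anything after it.
    answer = answer.split(eos_token, 1)[0]
    return answer.strip()
```

With these helpers, `prompt_text` could be produced by `build_prompt(...)`, and `extract_translation(tokenizer.decode(output[0]))` would return the translation without the echoed prompt.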

### See also

For details, see [C3TR-Adapter](https://huggingface.co/webbigdata/C3TR-Adapter).