---
license: apache-2.0
datasets:
- squarelike/sharegpt_deepl_ko_translation
language:
- en
- ko
pipeline_tag: translation
---
# Gugugo-koen-7B-V1.1

For details, see the repository: [https://github.com/jwj7140/Gugugo](https://github.com/jwj7140/Gugugo)

![Gugugo](./logo.png)

**Base Model**: [Llama-2-ko-7b](https://huggingface.co/beomi/llama-2-ko-7b)

**Training Dataset**: [sharegpt_deepl_ko_translation](https://huggingface.co/datasets/squarelike/sharegpt_deepl_ko_translation)

I trained the model on a single A6000 GPU for 90 hours.

## **Prompt Template**

**KO->EN**
```
### 한국어: {sentence}
### 영어:
```

**EN->KO**
```
### 영어: {sentence}
### 한국어:
```

GPTQ, AWQ, and GGUF versions are also available (a minimal GGUF loading sketch follows the implementation code below):

[https://huggingface.co/squarelike/Gugugo-koen-7B-V1.1-GPTQ](https://huggingface.co/squarelike/Gugugo-koen-7B-V1.1-GPTQ)

[https://huggingface.co/squarelike/Gugugo-koen-7B-V1.1-AWQ](https://huggingface.co/squarelike/Gugugo-koen-7B-V1.1-AWQ)

[https://huggingface.co/squarelike/Gugugo-koen-7B-V1.1-GGUF](https://huggingface.co/squarelike/Gugugo-koen-7B-V1.1-GGUF)

## **Implementation Code**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria, StoppingCriteriaList
import torch

repo = "squarelike/Gugugo-koen-7B-V1.1"
model = AutoModelForCausalLM.from_pretrained(
    repo,
    load_in_4bit=True,
    device_map='auto'
)
tokenizer = AutoTokenizer.from_pretrained(repo)

# Stop generation as soon as one of the given token sequences appears at the end of the output.
class StoppingCriteriaSub(StoppingCriteria):
    def __init__(self, stops=[], encounters=1):
        super().__init__()
        self.stops = [stop for stop in stops]

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor):
        for stop in self.stops:
            if torch.all((stop == input_ids[0][-len(stop):])).item():
                return True
        return False

# Token-id variants of the "</끝>" end-of-translation marker.
stop_words_ids = torch.tensor([[829, 45107, 29958], [1533, 45107, 29958], [829, 45107, 29958], [21106, 45107, 29958]]).to("cuda")
stopping_criteria = StoppingCriteriaList([StoppingCriteriaSub(stops=stop_words_ids)])

def gen(lan="en", x=""):
    # Build the translation prompt in the direction given by `lan`.
    if (lan == "ko"):
        prompt = f"### 한국어: {x}\n### 영어:"
    else:
        prompt = f"### 영어: {x}\n### 한국어:"
    gened = model.generate(
        **tokenizer(
            prompt,
            return_tensors='pt',
            return_token_type_ids=False
        ).to("cuda"),
        max_new_tokens=2000,
        temperature=0.3,
        # no_repeat_ngram_size=5,
        num_beams=5,
        stopping_criteria=stopping_criteria
    )
    # Strip the prompt and the "</끝>" end marker from the decoded output.
    return tokenizer.decode(gened[0][1:]).replace(prompt+" ", "").replace("</끝>", "")

print(gen(lan="en", x="Hello, world!"))
```
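
## **GGUF Loading Sketch**

The quantized releases linked above can be run without the full-precision checkpoint. Below is a minimal, untested sketch for the GGUF build using `llama-cpp-python`. The `.gguf` filename is a placeholder (check the GGUF repository for the actual file names), and the `"</끝>"` stop string mirrors the end marker handled in the transformers example above.

```python
# Minimal sketch (not from the original card): run the GGUF build with llama-cpp-python.
# The .gguf filename below is a placeholder; check the GGUF repository for real filenames.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="squarelike/Gugugo-koen-7B-V1.1-GGUF",
    filename="gugugo-koen-7b-v1.1.Q4_K_M.gguf"  # placeholder filename
)

llm = Llama(model_path=model_path, n_ctx=4096)

def translate(lan="en", x=""):
    # Same prompt format as the transformers example above.
    if lan == "ko":
        prompt = f"### 한국어: {x}\n### 영어:"
    else:
        prompt = f"### 영어: {x}\n### 한국어:"
    out = llm(prompt, max_tokens=512, temperature=0.3, stop=["</끝>"])
    return out["choices"][0]["text"].strip()

print(translate(lan="en", x="Hello, world!"))
```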