traintogpb/llama-3-enko-translator-8b-qlora-bf16-upscaled

traintogpb committed on May 23

Commit e658f6d • 1 Parent(s): 1ea767f

Update README.md

Files changed (1)

README.md +56 -0

@@ -7,5 +7,61 @@ language:
 - ko
 pipeline_tag: translation
 ---

+### Pretrained LM
+- [beomi/Llama-3-Open-Ko-8B](https://huggingface.co/beomi/Llama-3-Open-Ko-8B) (MIT License)
+### Training Dataset
+- [traintogpb/aihub-flores-koen-integrated-sparta-mini-300k](https://huggingface.co/datasets/traintogpb/aihub-flores-koen-integrated-sparta-mini-300k)
+- Translates between English and Korean (bi-directional)
+### Prompt
+- Template:
+  ```python
+    prompt = f"Translate this from {src_lang} to {tgt_lang}\n### {src_lang}: {src_text}\n### {tgt_lang}: "
+    # src_lang can be 'English' or '한국어'
+    # tgt_lang can be '한국어' or 'English'
+  ```
+  Note the trailing space (shown as `_`) at the end of the prompt; without it, the first generated token can be unpredictable.
+  If you use vLLM, the trailing space (`_`) can be omitted.
+### Training
+- Trained with QLoRA
+  - PLM: NormalFloat 4-bit
+  - Adapter: BrainFloat 16-bit
+  - Adapters applied to all linear layers (around 2.05%)
+- Adapters merged into the base model, which was upscaled to BrainFloat 16-bit precision
+### Usage (IMPORTANT)
+- Remove the EOS token (`<|endoftext|>`, id=46332) from the end of the tokenized prompt before generating.
+  ```python
+    import torch
+    from transformers import AutoModelForCausalLM, AutoTokenizer
+
+    # MODEL
+    model_name = 'traintogpb/llama-3-enko-translator-8b-qlora-bf16-upscaled'
+    model = AutoModelForCausalLM.from_pretrained(
+        model_name,
+        max_length=768,
+        attn_implementation='flash_attention_2',
+        torch_dtype=torch.bfloat16,
+    )
+    # FlashAttention-2 needs a CUDA GPU, so move the model there
+    model = model.to('cuda')
+    # Load the tokenizer from the same repository as the model
+    tokenizer = AutoTokenizer.from_pretrained(model_name)
+    tokenizer.pad_token_id = 128002 # eos_token_id and pad_token_id should be different
+
+    text = "Someday, QWER will be the greatest girl band in the world."
+    input_prompt = f"Translate this from English to 한국어.\n### English: {text}\n### 한국어:"
+    inputs = tokenizer(input_prompt, max_length=768, truncation=True, return_tensors='pt').to(model.device)
+    # Drop a trailing EOS token if the tokenizer appended one
+    if inputs['input_ids'][0][-1] == tokenizer.eos_token_id:
+        inputs['input_ids'] = inputs['input_ids'][:, :-1]
+        inputs['attention_mask'] = inputs['attention_mask'][:, :-1]
+
+    outputs = model.generate(**inputs, max_length=768, eos_token_id=tokenizer.eos_token_id)
+    # Decode only the newly generated tokens, skipping the prompt
+    input_len = inputs['input_ids'].shape[1]
+    translation = tokenizer.decode(outputs[0][input_len:], skip_special_tokens=True)
+    print(translation)
+  ```
+### Framework versions
+- PEFT 0.8.2
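
The prompt template and the EOS-removal step described in the README can be sketched as two small pure-Python helpers. This is only an illustration, not part of the commit: `build_prompt`, `strip_trailing_eos`, and `SUPPORTED_LANGS` are hypothetical names, and the template follows the Prompt section as written (no period after the target language, one trailing space). The Usage snippet writes the instruction slightly differently, and the README does not say which variant matches training.

```python
# Language names listed in the README's Prompt section.
SUPPORTED_LANGS = ("English", "한국어")


def build_prompt(src_lang: str, tgt_lang: str, src_text: str) -> str:
    """Fill the README's prompt template (hypothetical helper)."""
    if src_lang not in SUPPORTED_LANGS or tgt_lang not in SUPPORTED_LANGS:
        raise ValueError(f"languages must be one of {SUPPORTED_LANGS}")
    if src_lang == tgt_lang:
        raise ValueError("source and target languages must differ")
    # The template ends with a single trailing space after the colon.
    return (
        f"Translate this from {src_lang} to {tgt_lang}\n"
        f"### {src_lang}: {src_text}\n"
        f"### {tgt_lang}: "
    )


def strip_trailing_eos(token_ids: list[int], eos_token_id: int) -> list[int]:
    """Drop one trailing EOS id if present (hypothetical helper)."""
    if token_ids and token_ids[-1] == eos_token_id:
        return token_ids[:-1]
    return token_ids


if __name__ == "__main__":
    print(repr(build_prompt("English", "한국어", "Good morning.")))
    # The ids below are made up for illustration only.
    print(strip_trailing_eos([10, 20, 99], eos_token_id=99))
```

Working on plain lists of ids keeps the sketch independent of any tokenizer; with tensors, the same check applies to the last position of `input_ids` and `attention_mask`.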