
Introduction

This model was trained to translate English sentences into Korean using the 486k dataset from squarelike/sharegpt_deepl_ko_translation.

Loading the Model

Use the following Python code to load the model:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "nayohan/llama3-8b-it-translation-sharegpt-en-ko"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
  model_name,
  device_map="auto",
  torch_dtype=torch.bfloat16
)
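As a rough guide to the hardware the loading code above needs, the sketch below estimates the size of the weights alone from the 8.03B parameter count reported for this model. The helper name `weight_memory_gib` is hypothetical, and the figure excludes activations, the KV cache, and framework overhead, so treat it as a lower bound rather than a measured requirement.

```python
# Rough weight-memory estimate. The parameter count (8.03B) is the figure
# reported for this model; everything else is simple arithmetic.
PARAM_COUNT = 8.03e9
BYTES_PER_PARAM = {"bfloat16": 2, "float16": 2, "float32": 4}


def weight_memory_gib(param_count: float, dtype: str = "bfloat16") -> float:
    """Return the approximate size of the weights alone, in GiB."""
    return param_count * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_memory_gib(PARAM_COUNT, dtype):.1f} GiB")
```

With `device_map="auto"`, layers that do not fit on the GPU may be placed on CPU, which usually works but slows generation.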

Generating Text

This model supports translation from English to Korean. To generate text, use the following Python code:

# Korean system prompt: "You are a translator. Translate English into Korean."
system_prompt = "당신은 번역기 입니다. 영어를 한국어로 번역하세요."
sentence = "The aerospace industry is a flower in the field of technology and science."
conversation = [{'role': 'system', 'content': system_prompt},
                {'role': 'user', 'content': sentence}]

inputs = tokenizer.apply_chat_template(
  conversation,
  tokenize=True,
  add_generation_prompt=True,
  return_tensors='pt'
).to(model.device)  # follow the device chosen by device_map="auto"

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][len(inputs[0]):]))
# Result
# INPUT: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nActs as a translator. Translate en sentences into ko sentences in  colloquial style.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nThe aerospace industry is a flower in the field of technology and science.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
# OUTPUT: 항공우주 산업은 기술과 과학 분야의 꽃입니다.<|eot_id|>

# INPUT: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n당신은 번역기 입니다. 영어를 한국어로 번역하세요.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nTechnical and basic sciences are very important in terms of research. It has a significant impact on the industrial development of a country. Government policies control the research budget.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
# OUTPUT: 기술 및 기초 과학은 연구 측면에서 매우 중요합니다. 이는 한 국가의 산업 발전에 큰 영향을 미칩니다. 정부 정책에 따라 연구 예산이 결정됩니다.<|eot_id|>
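The generation steps above can be wrapped in a small helper. The following is a minimal sketch, not the author's code: the names `build_conversation`, `render_llama3_prompt`, `clean_output`, and `translate` are hypothetical. `render_llama3_prompt` only illustrates the prompt layout visible in the INPUT lines (where `\n\n` stands for real newlines); actual tokenization should go through `tokenizer.apply_chat_template`. The `translate` function assumes a `transformers` version whose `apply_chat_template` accepts `return_dict=True`.

```python
# Korean system prompt: "You are a translator. Translate English into Korean."
SYSTEM_PROMPT = "당신은 번역기 입니다. 영어를 한국어로 번역하세요."
EOT = "<|eot_id|>"


def build_conversation(sentence, system_prompt=SYSTEM_PROMPT):
    """Return the two-message chat used in the example above."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": sentence},
    ]


def render_llama3_prompt(conversation):
    """Illustrative only: mirror the prompt layout shown in the INPUT lines."""
    parts = ["<|begin_of_text|>"]
    for message in conversation:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}{EOT}"
        )
    # Generation prompt: an open assistant turn for the model to complete.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


def clean_output(decoded):
    """Remove the end-of-turn marker and surrounding whitespace."""
    return decoded.replace(EOT, "").strip()


def translate(model, tokenizer, sentence, max_new_tokens=256):
    """Translate one English sentence with an already loaded model/tokenizer."""
    inputs = tokenizer.apply_chat_template(
        build_conversation(sentence),
        tokenize=True,
        add_generation_prompt=True,
        return_dict=True,  # assumption: supported by the installed version
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the tokens generated after the prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return clean_output(tokenizer.decode(new_tokens))
```

With the model and tokenizer loaded as shown earlier, a call would look like `print(translate(model, tokenizer, "The aerospace industry is a flower in the field of technology and science."))`.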

Citation

@article{llama3modelcard,
        title={Llama 3 Model Card},
        author={AI@Meta},
        year={2024},
        url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}

Our training code can be found here: [TBD]

Model size: 8.03B params (Safetensors). Tensor type: BF16.