---
language:
- en
- ko
license: llama3
library_name: transformers
tags:
- translation
- enko
- ko
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
datasets:
- nayohan/aihub-en-ko-translation-1.2m
pipeline_tag: text-generation
---

# **Introduction**

This model was trained to translate single sentences from English to Korean, using a general-domain dataset of 1.18M examples.

Dataset: [nayohan/aihub-en-ko-translation-1.2m](https://huggingface.co/datasets/nayohan/aihub-en-ko-translation-1.2m)

### **Loading the Model**

Use the following Python code to load the model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "nayohan/llama3-8b-it-translation-general-en-ko-1sent"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",           # place the weights on the available device(s)
    torch_dtype=torch.bfloat16,  # load the weights in bfloat16
)
```

### **Generating Text**

To generate text, use the following Python code. Currently, this model supports only English-to-Korean translation; other languages, the reverse direction, and other styles are not supported.

```python
style = "written"
SYSTEM_PROMPT = f"Acts as a translator. Translate en sentences into ko sentences in {style} style."

s = "The aerospace industry is a flower in the field of technology and science."
conversation = [
    {'role': 'system', 'content': SYSTEM_PROMPT},
    {'role': 'user', 'content': s},
]

# Apply the Llama 3 chat template and move the token IDs to the model's device.
inputs = tokenizer.apply_chat_template(
    conversation,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors='pt'
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)

# Decode only the tokens generated after the prompt.
print(tokenizer.decode(outputs[0][len(inputs[0]):]))
```

```
# Result
# INPUT: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nActs as a translator. Translate en sentences into ko sentences in colloquial style.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nThe aerospace industry is a flower in the field of technology and science.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
# OUTPUT: 항공 우주 산업은 기술과 과학의 꽃입니다.<|eot_id|>

# INPUT: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nActs as a translator. Translate en sentences into ko sentences in colloquial style.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n Technical and basic sciences are very important in terms of research. It has a significant impact on the industrial development of a country. Government policies control the research budget.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
# OUTPUT: 기술과 기초과학은 연구 측면에서 매우 중요합니다. 한 국가의 산업 발전에 큰 영향을 미칩니다. 정부 정책은 연구 예산을 통제합니다.<|eot_id|>
```

### **Citation**

```bibtex
@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```

Our training code can be found here: [TBD]
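
### **Example: Wrapping Translation in a Function**

The generation steps in this card can be wrapped in a small helper so the prompt is built the same way on every call. The following is a minimal sketch, not the author's code: `build_conversation` and `translate` are hypothetical names, and it assumes `tokenizer` and `model` were loaded as shown in the loading section. It passes `return_dict=True` so that the attention mask is forwarded to `generate`, and it decodes with `skip_special_tokens=True` so markers such as `<|eot_id|>` are dropped from the returned string.

```python
from typing import Dict, List

# Prompt template copied from the generation example in this card.
SYSTEM_PROMPT_TEMPLATE = (
    "Acts as a translator. "
    "Translate en sentences into ko sentences in {style} style."
)


def build_conversation(sentence: str, style: str = "written") -> List[Dict[str, str]]:
    """Return the system and user messages for one English sentence."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT_TEMPLATE.format(style=style)},
        {"role": "user", "content": sentence},
    ]


def translate(sentence, tokenizer, model, style="written", max_new_tokens=256):
    """Hypothetical helper: translate one sentence with an already loaded model."""
    inputs = tokenizer.apply_chat_template(
        build_conversation(sentence, style),
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,  # return input_ids together with attention_mask
    ).to(model.device)
    prompt_length = inputs["input_ids"].shape[-1]
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens and drop special markers.
    return tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
```

With the model loaded, a call might look like `print(translate("The aerospace industry is a flower in the field of technology and science.", tokenizer, model))`. Because the model was trained on single sentences, passing one sentence per call is the safer choice.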