# Model Card for Llama-3-Open-Ko-8B
## Model Details
Llama-3-Open-Ko-8B is a continued-pretrained language model based on Llama-3-8B.
The model was trained entirely on publicly available resources, comprising 60GB+ of deduplicated text.
With the new Llama-3 tokenizer, pretraining covered 17.7B+ tokens, slightly more than the earlier Korean tokenizer (the Llama-2-Ko tokenizer) would produce.
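Before running the sample below, the model and tokenizer need to be loaded. This is a minimal loading sketch, not part of the original card: the Hub repository ID (`beomi/Llama-3-Open-Ko-8B`) and the `device_map`/dtype settings are assumptions and may need adjusting.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repository ID -- replace with the actual one if it differs.
model_id = "beomi/Llama-3-Open-Ko-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory vs. fp32; needs a recent GPU for speed
    device_map="auto",           # spread layers across available devices
)
```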
## Sample usage
```python
from transformers import pipeline
import torch

# `model` and `tokenizer` must be defined beforehand;
# a Hugging Face Hub model ID string also works for `model`.
pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.bfloat16},
    truncation=True,
)

def extract_response_llama3(question):
    messages = [
        {"role": "system", "content": ""},
        {"role": "user", "content": question},
    ]
    prompt = pipe.tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    # Stop on either the regular EOS token or Llama-3's end-of-turn token.
    terminators = [
        pipe.tokenizer.eos_token_id,
        pipe.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    ]
    outputs = pipe(
        prompt,
        max_new_tokens=256,
        eos_token_id=terminators,
        do_sample=True,
        temperature=0.1,
        top_p=0.9,
        num_return_sequences=1,
    )
    # `generated_text` includes the prompt; keep only the last line of the reply.
    return outputs[0]["generated_text"].split("\n")[-1]
```
```python
# "When allocating a budget, what do you call the method of ranking
# projects by priority and funding them differentially?"
question = "예산을 분배할 때 사업의 우선순위를 정해서 차등 지원하는 방법을 뭐라고 하지"
response = extract_response_llama3(question)
print(response)

# "Which body enacted the law for reducing emissions of fine-dust-forming
# substances and managing them comprehensively?"
question = "미세먼지 생성물질의 배출을 저감하고 종합적으로 관리하기 위한 법은 어디에서 제정했나"
response = extract_response_llama3(question)
print(response)

# "For which kind of place was the legal basis of an air-pollution-prevention
# policy prepared through the enactment of a special act?"
question = "어떤 장소의 대기오염을 방지하기 위한 정책의 법적 근거가 특별법의 제정으로 준비되었지"
response = extract_response_llama3(question)
print(response)
```
## Sample Output
- 선택과 집중 (selection and concentration)
- 환경부 (the Ministry of Environment)
- 항만 (ports)