Model Card for Llama-3-Open-Ko-8B
Model Details
Llama-3-Open-Ko-8B is a continued-pretrained language model based on Llama-3-8B.
The model was trained entirely on publicly available resources, comprising 60GB+ of deduplicated text.
With the new Llama-3 tokenizer, pretraining used 17.7B+ tokens, slightly more than the count produced by the Korean tokenizer (the Llama-2-Ko tokenizer).
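As a rough illustration of the tokenizer difference, the sketch below counts tokens for the same Korean sentence under both tokenizers. This is a minimal sketch; the repository IDs are assumptions, so substitute the checkpoints you actually use.

from transformers import AutoTokenizer

# Assumed repository IDs for the two tokenizers being compared.
llama3_ko_tok = AutoTokenizer.from_pretrained("beomi/Llama-3-Open-Ko-8B")
llama2_ko_tok = AutoTokenizer.from_pretrained("beomi/llama-2-ko-7b")

sample = "대한민국의 수도는 서울입니다."
print("Llama-3 tokenizer:", len(llama3_ko_tok.tokenize(sample)), "tokens")
print("Llama-2-Ko tokenizer:", len(llama2_ko_tok.tokenize(sample)), "tokens")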
Sample usage
from transformers import pipeline
import torch

model_id = "beomi/Llama-3-Open-Ko-8B"  # assumed repository ID; replace with the checkpoint you use

pipe = pipeline(
    task="text-generation",
    model=model_id,
    tokenizer=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    truncation=True,
)
def extract_response_llama3(question):
    messages = [
        {"role": "system", "content": ""},
        {"role": "user", "content": question},
    ]
    # Build the Llama-3 chat prompt from the message list.
    prompt = pipe.tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    # Stop generation at either the EOS token or the Llama-3 end-of-turn token.
    terminators = [
        pipe.tokenizer.eos_token_id,
        pipe.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    ]
    outputs = pipe(
        prompt,
        max_new_tokens=256,
        eos_token_id=terminators,
        do_sample=True,
        temperature=0.1,
        top_p=0.9,
        num_return_sequences=1,
    )
    # The pipeline returns the prompt plus the completion; keep only the last line,
    # which holds the model's short answer in these examples.
    return outputs[0]['generated_text'].split('\n')[-1]
question = "예산을 분배할 때 사업의 우선순위를 정해서 차등 지원하는 방법을 뭐라고 하지"  # "What do you call the method of setting priorities among projects and giving differentiated support when allocating a budget?"
response = extract_response_llama3(question)
print(response)
question = "미세먼지 생성물질의 배출을 저감하고 종합적으로 관리하기 위한 법은 어디서 제정했나"  # "Where was the law enacted to reduce emissions of fine-dust-forming substances and manage them comprehensively?"
response = extract_response_llama3(question)
print(response)
question = "어떤 장소의 대기오염을 방지하기 위한 정책의 법적 근거가 특별법의 제정으로 준비되었지"  # "For what kind of place was the legal basis of an air-pollution-prevention policy prepared through the enactment of a special act?"
response = extract_response_llama3(question)
print(response)
Sample Output
선택과 집중 (selection and concentration)
환경부 (the Ministry of Environment)
항만 (ports)
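Because extract_response_llama3 samples with do_sample=True at a low temperature (0.1), the answers above can vary slightly between runs. For fully reproducible output, the pipe call inside the helper can be switched to greedy decoding; this is a minimal sketch of that change, reusing the same prompt and terminators variables from the helper.

    outputs = pipe(
        prompt,
        max_new_tokens=256,
        eos_token_id=terminators,
        do_sample=False,  # greedy decoding; temperature and top_p no longer apply
    )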