

A model produced by removing the last 10 layers from the original Llama-3.1-8B-Instruct model and then retraining it.
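The card does not publish the exact pruning procedure, but removing the last N decoder layers of a Llama-style model is straightforward with `transformers`. A minimal sketch, using a tiny random config for illustration (for the real model you would load `meta-llama/Llama-3.1-8B-Instruct` instead):

```python
# Sketch: prune the last N decoder layers of a Llama-style model.
# The tiny config below is illustrative only; the actual pruning recipe
# used for this card is an assumption, not published.
from transformers import LlamaConfig, LlamaForCausalLM


def prune_last_layers(model, n_remove):
    """Drop the last n_remove decoder layers and update the config."""
    keep = model.config.num_hidden_layers - n_remove
    model.model.layers = model.model.layers[:keep]  # ModuleList slicing
    model.config.num_hidden_layers = keep
    return model


config = LlamaConfig(
    hidden_size=64, intermediate_size=128, num_hidden_layers=12,
    num_attention_heads=4, num_key_value_heads=4, vocab_size=1000,
)
model = LlamaForCausalLM(config)
model = prune_last_layers(model, n_remove=10)
print(len(model.model.layers))  # 2
```

After pruning, the truncated model can be saved with `model.save_pretrained(...)` and used as the starting point for retraining.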


Because removing layers degrades the model's knowledge, we first performed broad fine-tuning to recover the original model's extensive knowledge base. We then applied refined fine-tuning on high-quality datasets to strengthen the model's internal and linguistic representations, thereby improving its reliability.
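The two-stage procedure above can be sketched as two fine-tuning passes with different data and learning rates. The stand-in model, data, and hyperparameters below are illustrative assumptions; the card does not specify them:

```python
# Sketch of two-stage fine-tuning on a toy model. The stand-in model,
# random data, learning rates, and step counts are assumptions for
# illustration only.
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(8, 8)  # stand-in for the pruned language model
loss_fn = nn.MSELoss()


def finetune(model, data, lr, steps):
    """Run a simple supervised fine-tuning loop over (input, target) pairs."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        for x, y in data:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model


broad_data = [(torch.randn(4, 8), torch.randn(4, 8)) for _ in range(8)]
curated_data = [(torch.randn(4, 8), torch.randn(4, 8)) for _ in range(2)]

# Stage 1: broad fine-tuning to recover general knowledge.
finetune(model, broad_data, lr=1e-3, steps=3)
# Stage 2: refined fine-tuning on a smaller, high-quality set at a lower LR.
finetune(model, curated_data, lr=1e-4, steps=3)
```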

After training the model on a specific task, we merged the broadly fine-tuned model with the task-trained model.
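The card does not state the merge method; a simple weighted average of the two models' weights (linear merging, with the mixing weight `alpha` as an assumption) is the most common baseline and can be sketched as:

```python
# Sketch: linear merge of two models' weights, key by key.
# alpha=0.5 and the tiny Linear stand-ins are illustrative assumptions.
import torch
from torch import nn


def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    """Return alpha * sd_a + (1 - alpha) * sd_b for every parameter key."""
    return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}


base = nn.Linear(4, 4)  # stand-in for the broadly fine-tuned model
task = nn.Linear(4, 4)  # stand-in for the task-trained model

merged = nn.Linear(4, 4)
merged.load_state_dict(merge_state_dicts(base.state_dict(), task.state_dict()))
```

For real checkpoints the same idea applies to the full `state_dict()` of each model, provided both share the same architecture and parameter shapes.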

```python
import transformers
import torch

model_id = "kikikara/ko-llama-3.1-5b-instruct-FrankenMerging"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    # "You are a Korean AI model."
    {"role": "system", "content": "당신은 한국어 ai 모델입니다."},
    # "What is the meaning of life?"
    {"role": "user", "content": "인생의 의미란 뭐야?"},
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
```
Model size: 5.85B params · Tensor type: F32 (Safetensors)