Llama-3.2-3B-Instruct-Legal-Chatbot

Model Description

This model is a LoRA fine-tune of meta-llama/Llama-3.2-3B-Instruct, specialized in Korean civil law. It gives professional answers to users' legal questions and was trained to extract and present the associated legal entities, such as relevant statutes, precedents, and competent courts.

  • Base Model: meta-llama/Llama-3.2-3B-Instruct
  • Language: Korean (ko)
  • Task: Question Answering, Text Generation, Legal Entity Extraction
  • Training Method: Supervised Fine-Tuning (SFT) with PEFT (LoRA)

Intended Use & Limitations

Intended Use

  • Basic legal question answering on Korean civil law
  • Summarizing legal documents and extracting key statute and precedent numbers
  • Reference for research and education on legal AI assistants

Limitations & Ethical Considerations

  • Caution: This model was developed for educational and research purposes and cannot replace professional legal advice or a lawyer.
  • As with any LLM, hallucination can occur, so the model may generate statutes or precedents that do not exist. If actual legal action is required, always seek the advice of a legal professional (such as a lawyer).

Training Details

Training Data

  • Source: AI Hub (Korean civil law question-answering and precedent labeling data)
    • Question-answering category: 75,624 records
    • rudalson/legal-qa-1k-dataset - a dataset sampled for testing
  • Preprocessing: Article numbers, dates, and monetary amount formats were normalized, and the data was converted to the Llama 3 chat template (system-user-assistant) structure for training.
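The preprocessing described above can be sketched roughly as follows. This is a minimal illustration rather than the author's pipeline: the card does not specify the normalization rules, so the single article-number rule, the function names `normalize_articles` and `to_chat_example`, and the sample strings are all assumptions.

```python
import re

# Hypothetical rule (assumption): collapse spaces inside article references,
# e.g. "제 750 조" -> "제750조". The card's real rules are not published.
ARTICLE_RE = re.compile(r"제\s*(\d+)\s*조")


def normalize_articles(text: str) -> str:
    """Rewrite article references into a single canonical spelling."""
    return ARTICLE_RE.sub(r"제\1조", text)


def to_chat_example(system: str, question: str, answer: str) -> dict:
    """Wrap one Q&A pair in the system-user-assistant message structure."""
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": normalize_articles(question)},
            {"role": "assistant", "content": normalize_articles(answer)},
        ]
    }


# Illustrative strings only; they are not taken from the training data.
example = to_chat_example(
    "당신은 한국 법률 전문가 AI 어시스턴트입니다.",
    "민법 제 750 조는 무엇을 규정하나요?",
    "민법 제750조는 불법행위로 인한 손해배상 책임을 규정합니다.",
)
print(example["messages"][1]["content"])
```

A record in this `messages` shape can then be rendered to text with `tokenizer.apply_chat_template`, which is also the conversational layout that trl's SFTTrainer accepts.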

Training Procedure & Hyperparameters

The model was trained with Hugging Face's peft and trl (SFTTrainer) libraries.

  • LoRA Parameters:
    • r: 16
    • lora_alpha: 32
    • target_modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
    • lora_dropout: 0.05
  • Training Hyperparameters:
    • learning_rate: 2e-4
    • num_train_epochs: 1
    • per_device_train_batch_size: 4
    • gradient_accumulation_steps: 4
    • optimizer: adamw_torch
    • fp16 / bfloat16: Enabled
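To make the listed values easier to interpret, the sketch below collects them in plain dictionaries and derives a few quantities from them. The key names mirror the keyword arguments of `peft.LoraConfig` and `transformers.TrainingArguments` (where the optimizer is passed as `optim`). The device count and the number of training examples are assumptions, since the card states neither.

```python
import math

# Values copied from the lists above.
lora = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}
train = {
    "learning_rate": 2e-4,
    "num_train_epochs": 1,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_torch",
}

# Assumptions (not stated in the card): a single GPU, and all 75,624
# question-answering records used as training examples without packing.
NUM_DEVICES = 1
NUM_EXAMPLES = 75_624

# Standard LoRA scales the low-rank update by lora_alpha / r.
lora_scaling = lora["lora_alpha"] / lora["r"]

# Number of examples that contribute to each optimizer step.
effective_batch = (
    train["per_device_train_batch_size"]
    * train["gradient_accumulation_steps"]
    * NUM_DEVICES
)

# Approximate optimizer steps in one epoch under the assumptions above.
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)

print(lora_scaling, effective_batch, steps_per_epoch)  # 2.0 16 4727
```

With more devices, or with a held-out split taken from the 75,624 records, the effective batch size and step count change accordingly.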

Evaluation

Evaluation was based on ROUGE scores and on the extraction accuracy for legal named entities (statutes, precedents, courts, etc.).

  • ROUGE-1: 0.2222
  • ROUGE-L: 0.2222
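For readers unfamiliar with these metrics, here is a minimal sketch of ROUGE-1 and ROUGE-L as F1 scores. It assumes whitespace tokenization, which the card does not confirm; Korean text is often scored after morpheme-level tokenization, so this will not necessarily reproduce the numbers above. The function names are illustrative.

```python
from collections import Counter


def _f1(overlap: int, cand_len: int, ref_len: int) -> float:
    """F1 from an overlap count and the two sequence lengths."""
    if overlap == 0:
        return 0.0
    precision = overlap / cand_len
    recall = overlap / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1(candidate: str, reference: str) -> float:
    """Unigram overlap with clipped counts (assumes whitespace tokens)."""
    cand, ref = candidate.split(), reference.split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    return _f1(overlap, len(cand), len(ref))


def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: str, reference: str) -> float:
    """F1 based on the longest common subsequence of tokens."""
    cand, ref = candidate.split(), reference.split()
    return _f1(lcs_length(cand, ref), len(cand), len(ref))


# Toy strings for illustration only.
print(rouge_1("a b d", "a b c d"), rouge_l("a b d", "a b c d"))
```

Under the same tokenization, ROUGE-L cannot exceed ROUGE-1, and the two coincide when the shared tokens appear in the same order in both texts.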

How to Get Started with the Model

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path of the model uploaded to Hugging Face
model_name = "rudalson/Llama-3.2-3B-Instruct-Legal-Chatbot"

# 1. Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    low_cpu_mem_usage=True
)

# 2. Inference test
# The prompt is kept in Korean: "Can I claim damages when a contract is rescinded?"
prompt = "계약 해제 시 손해배상을 청구할 수 있나요?"
messages = [
    # System prompt (Korean): "You are an AI assistant specializing in Korean law. Provide accurate, professional answers to the user's questions."
    {"role": "system", "content": "당신은 한국 법률 전문가 AI 어시스턴트입니다. 사용자의 질문에 대해 정확하고 전문적인 답변을 제공하세요."},
    {"role": "user", "content": prompt}
]

# Apply the chat template
formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
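# The rendered template already contains the BOS token, so do not add special tokens again
inputs = tokenizer(formatted_prompt, return_tensors="pt", add_special_tokens=False).to(model.device)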

# Generate the answer
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.1,
        do_sample=True
    )

response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(f"Question: {prompt}")
print(f"Answer: {response.strip()}")