
CC-KuLLM3-LoRA

This model is a fine-tuned version of nlpai-lab/KULLM3 on the AI-Hub/민원(콜센터) 질의-응답 데이터 (civil-complaint call-center Q&A data) and AI-Hub/용도별 목적대화 데이터 (goal-oriented dialogue data by use case) datasets. It achieves the following result on the evaluation set:

  • Loss: 0.6711751222610474

This model is fine-tuned for multi-turn conversations in the CC (Contact Center) domain.

Python code

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from peft import PeftModel
from trl import DataCollatorForCompletionOnlyLM

device = 'cuda' if torch.cuda.is_available() else 'cpu'
base_model = AutoModelForCausalLM.from_pretrained(
    'nlpai-lab/KULLM3',
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map=device
)
# load the tokenizer saved with the adapter (it may contain added tokens)
tokenizer = AutoTokenizer.from_pretrained(
    'song9/CC-KuLLM3-LoRA'
)
# resize the embedding matrix in case the fine-tuned tokenizer added tokens
base_model.resize_token_embeddings(len(tokenizer))
peft_model = PeftModel.from_pretrained(
    base_model,
    'song9/CC-KuLLM3-LoRA'
)
# inference
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
conversation = [
    # previous context: "I'd like to return the product I ordered last week. How do I do that?"
    {'role':'user','content':"지난주에 주문한 제품 반품하고 싶은데 어떻게 하죠?"},
    # previous context: "Could you tell me the reason for the return?"
    {'role':'assistant','content':'반품 사유를 말씀해주시겠습니까?'},
    # current turn: "Ah yes, the product seems to have been damaged during delivery."
    {'role':'user','content':'아 네 제품이 배송 중에 파손된 것 같아요.'}
]
response_template = '[/INST]'
instruction_template = '[INST]'
collator = DataCollatorForCompletionOnlyLM(
    instruction_template=instruction_template,
    response_template=response_template,
    tokenizer=tokenizer
)
inputs = collator.tokenizer(  # collator.tokenizer is the same tokenizer passed above
    tokenizer.apply_chat_template(
        conversation,
        tokenize=False,
        add_generation_prompt=True
    ),
    padding=True,
    truncation=True,
    return_tensors='pt'
).to(device)
output = peft_model.generate(
    inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    streamer=streamer,
    max_new_tokens=1024,
    use_cache=True
)
# print(tokenizer.decode(output[0]))
# example output: 네 그러시군요. 고객님 불편을 드려 죄송합니다.
# ("I see. We are sorry for the inconvenience.")
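
Optionally, the LoRA adapter can be merged into the base model so the result loads as a plain transformers checkpoint without PEFT. A minimal sketch; the output path below is only an example:

# merge the adapter weights into the base model and save a standalone copy
merged_model = peft_model.merge_and_unload()
merged_model.save_pretrained('CC-KuLLM3-merged')  # example output path
tokenizer.save_pretrained('CC-KuLLM3-merged')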

License

Apache-2.0 (following the original KULLM3 repository)

Training and evaluation data

The datasets are preprocessed with two scripts.

Set the file path and Hugging Face access token to your own and run the scripts. They follow this Medium article: fine-tuning-chat-based-llm.
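
As a rough illustration only (the real preprocessing scripts are linked above), each multi-turn record can be flattened into a single training string with the tokenizer's chat template. The field names dialogues / speaker / utterance below are hypothetical placeholders, not the actual AI-Hub schema:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('nlpai-lab/KULLM3')

def to_chat_text(record):
    # 'dialogues', 'speaker', 'utterance' are illustrative field names only
    conversation = []
    for turn in record['dialogues']:
        role = 'user' if turn['speaker'] == 'customer' else 'assistant'
        conversation.append({'role': role, 'content': turn['utterance']})
    # apply_chat_template renders the turns with the model's instruction markers
    # (the [INST]/[/INST] markers referenced by the collator in the inference example)
    return tokenizer.apply_chat_template(conversation, tokenize=False)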

Training procedure

The training code is here: code
It is based on this example: peft/examples

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of the corresponding Trainer setup follows the list):

  • logging_steps: 800
  • save_steps: 8000
  • max_steps: 82422
  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 2
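
For reference, here is a minimal sketch of how these hyperparameters might be wired into Trainer with a LoRA adapter and the completion-only collator from the inference example. The LoRA settings (r, lora_alpha, lora_dropout, target_modules), output_dir, and bf16 are assumptions not stated in this card, and train_ds / eval_ds stand for the already-tokenized datasets:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model
from trl import DataCollatorForCompletionOnlyLM

base_model = AutoModelForCausalLM.from_pretrained(
    'nlpai-lab/KULLM3', torch_dtype=torch.bfloat16, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained('nlpai-lab/KULLM3')

# LoRA settings below are illustrative assumptions; the card does not list them
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=['q_proj', 'v_proj'],
    task_type='CAUSAL_LM'
)
model = get_peft_model(base_model, lora_config)

# mask everything before [/INST] so loss is computed on assistant completions only
collator = DataCollatorForCompletionOnlyLM(
    instruction_template='[INST]', response_template='[/INST]', tokenizer=tokenizer
)

args = TrainingArguments(
    output_dir='cc-kullm3-lora',   # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    num_train_epochs=2,
    max_steps=82422,               # overrides num_train_epochs when set
    lr_scheduler_type='linear',
    warmup_ratio=0.05,
    logging_steps=800,
    save_steps=8000,
    seed=42,
    adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8,
    bf16=True,                     # assumed, matching the bfloat16 inference dtype
)

trainer = Trainer(
    model=model, args=args,
    train_dataset=train_ds, eval_dataset=eval_ds,   # pre-tokenized datasets (assumed)
    data_collator=collator, tokenizer=tokenizer,
)
trainer.train()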

Training results

Training Loss | Epoch  | Step  | Validation Loss
0.8004        | 0.1941 | 8000  | 0.7944
0.7445        | 0.3882 | 16000 | 0.7504
0.7058        | 0.5823 | 24000 | 0.7259
0.7039        | 0.7764 | 32000 | 0.7099
0.6861        | 0.9706 | 40000 | 0.6985
0.6670        | 1.1647 | 48000 | 0.6902
0.6562        | 1.3588 | 56000 | 0.6831
0.6605        | 1.5529 | 64000 | 0.6782
0.6474        | 1.7471 | 72000 | 0.6739
0.6537        | 1.9412 | 80000 | 0.6711
No log        | 2.0000 | 82422 | 0.6711

Framework versions

  • PEFT 0.13.2
  • Transformers 4.45.1
  • Pytorch 2.3.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.20.0