CC-KuLLM3-LoRA
This model is a fine-tuned version of nlpai-lab/KULLM3 on the AI-Hub 민원(콜센터) 질의-응답 데이터 (call-center complaint Q&A) and AI-Hub 용도별 목적대화 데이터 (purpose-specific dialogue) datasets. It achieves the following result on the evaluation set:
- Loss: 0.6711751222610474
This model is fine-tuned for multi-turn conversations in the CC (Contact Center) domain.
Python code
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from peft import PeftModel

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the KULLM3 base model
base_model = AutoModelForCausalLM.from_pretrained(
    'nlpai-lab/KULLM3',
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map=device
)

# Load the adapter's tokenizer and resize the base embeddings to match its vocabulary
tokenizer = AutoTokenizer.from_pretrained('song9/CC-KuLLM3-LoRA')
base_model.resize_token_embeddings(len(tokenizer))

# Attach the LoRA adapter
peft_model = PeftModel.from_pretrained(base_model, 'song9/CC-KuLLM3-LoRA')

# Inference
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
conversation = [
    {'role': 'user', 'content': "지난주에 주문한 제품 반품하고 싶은데 어떻게 하죠?"},  # previous context
    {'role': 'assistant', 'content': '반품 사유를 말씀해주시겠습니까?'},              # previous context
    {'role': 'user', 'content': '아 네 제품이 배송 중에 파손된 것 같아요.'}
]

# Render the conversation with the chat template (the adapter was trained with
# [INST] / [/INST] instruction and response markers), then tokenize
prompt = tokenizer.apply_chat_template(
    conversation,
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer(
    prompt,
    padding=True,
    truncation=True,
    return_tensors='pt'
).to(device)

output = peft_model.generate(**inputs, streamer=streamer, max_new_tokens=1024, use_cache=True)
# print(tokenizer.decode(output[0]))
# 네 그러시군요. 고객님 불편을 드려 죄송합니다.
# (Yes, I understand. We apologize for the inconvenience.)
```
License
Apache-2.0 (following the original repository)
Training and evaluation data
The datasets are preprocessed with two scripts:
- AI-Hub/민원 질의-응답 데이터 preprocessing script: ai_hub_CC_QA_data_preprocessing_multithread_for_hf.py
- AI-Hub/용도별 목적대화 데이터 preprocessing script: ai_hub_multi_subject_conversations_data_preprocessing_multithread_for_hf.py
Set the file path and Hugging Face access token to your own values and run the scripts. They follow the Medium post fine-tuning-chat-based-llm.
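The real logic lives in the two scripts above; as a rough, hedged illustration of the idea only (the field names `dialog`, `speaker`, and `text` below are assumptions, not the actual AI-Hub schema), each raw dialogue is turned into a list of role/content turns that the chat template can render:

```python
# Illustrative sketch only; see the two preprocessing scripts above for the real logic.
# The keys 'dialog', 'speaker', and 'text' are assumed field names, not the AI-Hub schema.
def dialogue_to_messages(raw_dialogue: dict) -> list[dict]:
    messages = []
    for turn in raw_dialogue['dialog']:
        role = 'user' if turn['speaker'] == 'customer' else 'assistant'
        messages.append({'role': role, 'content': turn['text']})
    return messages

# The rendered text, e.g. tokenizer.apply_chat_template(messages, tokenize=False),
# is what the trainer consumes.
```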
Training procedure
The training code is here: code.
It follows this PEFT example: peft/examples. A rough sketch of the training setup appears after the hyperparameter list below.
Training hyperparameters
The following hyperparameters were used during training:
- logging_steps: 800
- save_steps: 8000
- max_steps: 82422
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 2
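The following is a minimal sketch of how these hyperparameters could be wired into a LoRA SFT run with trl's SFTTrainer and a completion-only collator using the [INST] / [/INST] markers from the inference snippet. It is not the published training code (linked above); the LoRA rank, target modules, max sequence length, dataset column name, and variable names are assumptions.

```python
from peft import LoraConfig
from trl import SFTTrainer, SFTConfig, DataCollatorForCompletionOnlyLM

# Assumed LoRA settings; the actual rank/alpha/target modules are defined in the linked code.
lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=['q_proj', 'v_proj'], task_type='CAUSAL_LM')

args = SFTConfig(
    output_dir='CC-KuLLM3-LoRA',
    dataset_text_field='text',        # assumed column holding the rendered chat text
    max_seq_length=1024,              # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    num_train_epochs=2,
    lr_scheduler_type='linear',
    warmup_ratio=0.05,
    logging_steps=800,
    save_steps=8000,
    seed=42,
    bf16=True,
)

# Loss is computed only on assistant completions that follow [/INST].
collator = DataCollatorForCompletionOnlyLM(
    instruction_template='[INST]',
    response_template='[/INST]',
    tokenizer=tokenizer,
)

trainer = SFTTrainer(
    model=base_model,                 # nlpai-lab/KULLM3 loaded as in the inference snippet
    args=args,
    peft_config=lora_config,
    train_dataset=train_dataset,      # preprocessed AI-Hub data (hypothetical variable names)
    eval_dataset=eval_dataset,
    data_collator=collator,
    tokenizer=tokenizer,
)
trainer.train()
```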
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.8004 | 0.1941 | 8000 | 0.7944 |
| 0.7445 | 0.3882 | 16000 | 0.7504 |
| 0.7058 | 0.5823 | 24000 | 0.7259 |
| 0.7039 | 0.7764 | 32000 | 0.7099 |
| 0.6861 | 0.9706 | 40000 | 0.6985 |
| 0.6670 | 1.1647 | 48000 | 0.6902 |
| 0.6562 | 1.3588 | 56000 | 0.6831 |
| 0.6605 | 1.5529 | 64000 | 0.6782 |
| 0.6474 | 1.7471 | 72000 | 0.6739 |
| 0.6537 | 1.9412 | 80000 | 0.6711 |
| No log | 2.0000 | 82422 | 0.6711 |
Framework versions
- PEFT 0.13.2
- Transformers 4.45.1
- Pytorch 2.3.1+cu121
- Datasets 2.21.0
- Tokenizers 0.20.0