How to use vochris/viet-customer-support-model with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for vochris/viet-customer-support-model to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for vochris/viet-customer-support-model to start chatting
Using Hugging Face Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for vochris/viet-customer-support-model to start chatting
Load model with FastModel
pip install unsloth

from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="vochris/viet-customer-support-model",
    max_seq_length=2048,
)
qwen2.5-3b-viet-customer-support-lora
Vietnamese-first bilingual customer support LoRA adapter fine-tuned from Qwen2.5-3B-Instruct with Unsloth.
What this model is for
- Vietnamese customer support conversations
- English fallback responses
- Polite, concise, next-step-oriented support messaging
Base model
Qwen/Qwen2.5-3B-Instruct
Training data (high-level)
- OPUS-100 EN↔VI parallel pairs (filtered)
- Synthetic customer support instruction examples (order status, refunds, shipping delays, account issues, billing)
- Final split: 176k train / 4k eval
Training setup
- Framework: Unsloth + TRL SFTTrainer
- Precision: bf16
- Quantization for training: 4-bit base model
- LoRA: r=32, alpha=64, dropout=0.0
- Max sequence length: 768 tokens
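To make the LoRA numbers above concrete, here is a small arithmetic sketch of what r=32 and alpha=64 imply under standard LoRA (without rank stabilization): the low-rank update is scaled by alpha / r, and each adapted linear layer with input size d_in and output size d_out adds r * (d_in + d_out) trainable parameters. The 2048 x 2048 layer shape is an assumption chosen for illustration; this card does not state which modules were adapted or their shapes.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return alpha / r


def lora_params_per_layer(r: int, d_in: int, d_out: int) -> int:
    """Trainable parameters added to one linear layer.

    Matrix A has shape (r, d_in) and matrix B has shape (d_out, r).
    """
    return r * (d_in + d_out)


R, ALPHA = 32, 64        # values from the training setup above
D_IN = D_OUT = 2048      # assumed layer shape, for illustration only

print("scaling:", lora_scaling(R, ALPHA))
print("added parameters per layer:", lora_params_per_layer(R, D_IN, D_OUT))
```

With alpha set to twice the rank, the adapter update is applied at a scale of 2.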
Prompt format
Instruction:
{instruction}
User:
{input}
Assistant:
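The format above can be filled in with a small helper. This is a minimal sketch: `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, not part of any library, and the trailing newline after "Assistant:" is an assumption taken from the usage example in this card.

```python
PROMPT_TEMPLATE = "Instruction:\n{instruction}\nUser:\n{input}\nAssistant:\n"


def build_prompt(instruction: str, user_input: str) -> str:
    """Fill the card's prompt format, trimming whitespace around each field."""
    return PROMPT_TEMPLATE.format(
        instruction=instruction.strip(),
        input=user_input.strip(),
    )


if __name__ == "__main__":
    print(build_prompt(
        "You are a customer support agent. Reply politely and state the next step.",
        "My order #A12345 is 5 days late.",
    ))
```

Keeping the template in one place makes it easier to use the same format at inference time as in training.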
Quick usage (PEFT)
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen2.5-3B-Instruct"
adapter = "REPLACE_WITH_HF_REPO"  # Hub repo id of this LoRA adapter

# Load the base model in bf16, then attach the LoRA adapter on top of it.
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
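# English gloss of the Vietnamese prompt text:
#   Instruction: You are a Vietnamese-language customer support agent.
#                Reply politely and with empathy, and state clear next steps.
#   User: Hello, my order is 5 days late. Order code #A12345.
prompt = """Instruction: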
Bạn là nhân viên chăm sóc khách hàng tiếng Việt. Trả lời lịch sự, đồng cảm, và nêu bước tiếp theo rõ ràng.
User:
Xin chào, đơn hàng của tôi bị trễ 5 ngày. Mã đơn #A12345.
Assistant:
"""
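inputs = tok(prompt, return_tensors="pt").to(model.device)
# Sample at a low temperature and cap the reply at 180 new tokens.
out = model.generate(**inputs, max_new_tokens=180, temperature=0.3, do_sample=True)
# The decoded text contains the prompt followed by the generated reply.
print(tok.decode(out[0], skip_special_tokens=True))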
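Because `generate` returns the prompt tokens followed by the new tokens, the decoded string printed above starts with the prompt itself. A minimal sketch for keeping only the reply, assuming the prompt format from this card: `extract_reply` is a hypothetical helper, and cutting at a following "User:" line is an assumption about how the model might continue past its answer.

```python
def extract_reply(decoded: str, marker: str = "Assistant:", stop: str = "\nUser:") -> str:
    """Return the text after the first `marker`, cut at `stop` if it appears."""
    _, found, tail = decoded.partition(marker)
    if not found:
        # No marker in the text: return it unchanged apart from trimming.
        return decoded.strip()
    reply, _, _ = tail.partition(stop)
    return reply.strip()


if __name__ == "__main__":
    sample = "Instruction:\nBe polite.\nUser:\nHi\nAssistant:\nHello, how can I help?"
    print(extract_reply(sample))
```

An alternative that avoids string matching is to slice the generated token ids by the prompt length before decoding.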
Limitations
- Not a legal/compliance authority
- Can still hallucinate policy details if product policy is ambiguous
- Should be paired with retrieval or hard policy checks in production
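One way to read "paired with retrieval" is to look up the relevant policy text and place it in the instruction before generating. The sketch below uses a toy in-memory knowledge base with word-overlap scoring. `POLICY_KB`, its entries, `retrieve_policy`, and `grounded_instruction` are all made-up placeholders for illustration; the entries are in English only to keep the sketch readable, and a production system would more likely use a search index or embeddings over the real policy documents.

```python
import re

# Hypothetical policy snippets; replace with your real policy knowledge base.
POLICY_KB = {
    "refund": "Refunds are accepted within 7 days of delivery with proof of purchase.",
    "shipping": "Standard shipping takes 3 to 5 business days; delays are compensated by voucher.",
}


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve_policy(query, kb=POLICY_KB):
    """Return the entry sharing the most words with the query, or None if no overlap."""
    query_words = tokenize(query)
    best_key, best_score = None, 0
    for key, text in kb.items():
        score = len(query_words & tokenize(key + " " + text))
        if score > best_score:
            best_key, best_score = key, score
    return kb[best_key] if best_key is not None else None


def grounded_instruction(base_instruction, query):
    """Append the retrieved policy, or a fallback rule when nothing matches."""
    policy = retrieve_policy(query)
    if policy is None:
        return base_instruction + "\nIf the policy is unknown, say so and offer to escalate."
    return base_instruction + "\nAnswer only from this policy:\n" + policy


if __name__ == "__main__":
    print(grounded_instruction("Be polite and concise.", "How do I get a refund after delivery?"))
```

The fallback branch matters as much as the match: when no policy is found, the instruction tells the model not to guess.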
Recommended production guardrails
- Ground responses on your real policy KB
- Enforce redaction/PII handling
- Add escalation rules for sensitive requests
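The redaction and escalation guardrails above could start from something like the following sketch. Everything here is an assumption made for illustration: the regular expressions cover only simple email addresses and one phone format (+84 or a leading 0 followed by 9 digits), and the keyword list is a crude placeholder. Real PII handling needs broader, locale-aware rules and review by whoever owns compliance.

```python
import re
import unicodedata

# Illustrative patterns only; they will miss many real-world PII formats.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"(?<!\w)(?:\+?84|0)\d{9}(?!\d)")

# Hypothetical trigger phrases (English and Vietnamese) for handing off to a human.
ESCALATION_KEYWORDS = tuple(
    unicodedata.normalize("NFC", k)
    for k in ("lawsuit", "lawyer", "chargeback", "khởi kiện", "luật sư")
)


def redact_pii(text):
    """Replace matched email addresses and phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def needs_escalation(text):
    """Return True if the message contains any escalation keyword."""
    lowered = unicodedata.normalize("NFC", text).lower()
    return any(keyword in lowered for keyword in ESCALATION_KEYWORDS)


if __name__ == "__main__":
    message = "Email an.nguyen@example.com or call 0912345678 about order #A12345."
    print(redact_pii(message))
    print(needs_escalation("Tôi sẽ khởi kiện công ty."))
```

Running redaction before the text reaches the model or the logs, and the escalation check before any reply is sent, keeps both rules independent of what the model generates.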