Qwen2.5 14B Fine-Tuned Korean Coding Assistant LoRA

A Qwen2.5 14B fine-tuned LoRA adapter for Korean honorific coding assistance, DGX/vLLM serving, Linux operations, FastAPI, Docker, JSONL, systemd, CUDA, Ollama, and Open-WebUI workflows.

This repository contains a PEFT/LoRA adapter for Qwen/Qwen2.5-14B-Instruct.

It does not include the base model weights. Use it with the base model:

Qwen/Qwen2.5-14B-Instruct

Release status

  • Status: PROMOTE_FINAL_GUARD_V2
  • Reason: v1 raw ๊ฒฐ๊ณผ์— avoid_chinese ์•ˆ์ „ fallback์„ ์ ์šฉํ•œ ์ตœ์ข… ํ†ตํ•ฉ guard๊ฐ€ ๊ธฐ์ค€์„ ๋งŒ์กฑํ–ˆ์Šต๋‹ˆ๋‹ค.
  • Updated at: 2026-06-29T04:41:54

Final integrated guard benchmark

Metric Value
average_score 94.65
pass_70_plus 20/20
strong_85_plus 20/20
perfect_100 7/20

Category averages:

{
  "cuda": 92.5,
  "docker": 91.0,
  "fastapi": 98.5,
  "jsonl": 89.5,
  "korean_style": 97.0,
  "linux": 100.0,
  "lora": 91.5,
  "ollama": 85.0,
  "openwebui": 97.0,
  "safety": 100.0,
  "systemd": 100.0,
  "vllm": 94.0
}

What this model is for

This adapter is designed for local Korean technical-assistant workflows, especially:

  • Korean honorific technical answers
  • Python and coding assistance
  • Linux operations
  • FastAPI examples
  • Docker and Open-WebUI workflows
  • vLLM OpenAI-compatible serving
  • JSONL validation
  • systemd troubleshooting
  • CUDA/PyTorch checks
  • Ollama model/server commands

Serving guard policy

Recommended operating structure:

  1. Serve the LoRA adapter with vLLM.
  2. Use post-check retry guard for general technical answers.
  3. Use self-correction for the avoid_chinese safety-style case.
  4. If self-correction fails, use this fixed fallback:
๋„ค, ์•ž์œผ๋กœ ๋ชจ๋“  ๋‹ต๋ณ€์€ ํ•œ๊ตญ์–ด ์กด๋Œ“๋ง๋กœ๋งŒ ์ž‘์„ฑํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.

vLLM serving example

python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-14B-Instruct \
  --dtype bfloat16 \
  --max-model-len 512 \
  --gpu-memory-utilization 0.28 \
  --max-num-seqs 1 \
  --max-num-batched-tokens 512 \
  --enable-lora \
  --lora-modules dgx-14b-champion=/path/to/adapter \
  --enforce-eager

Files

  • Adapter files are stored at the repository root.
  • Benchmark reports are in reports/.
  • Guard benchmark scripts are in guard/.
  • Example commands are in examples/.
  • Release metadata is in release_manifest.json.

Notes

This is an adapter release for local DGX/vLLM deployment. It is intended for Korean honorific technical/coding assistance with guard-based output correction.

Downloads last month
1
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for koreallmdev/qwen2-5-14b-korean-coding-assistant-lora

Base model

Qwen/Qwen2.5-14B
Adapter
(362)
this model