1. Overview

A Korean text-embedding model for the BC Card domain, built by LoRA fine-tuning Qwen/Qwen3-Embedding-0.6B on BC Card in-domain data (personal / merchant / corporate / VIP). It is intended as the retriever (bi-encoder) stage of a BC Card RAG pipeline.

On a held-out in-domain test set, it improves NDCG@10 by 8.2% (0.6186 to 0.6695) and Accuracy@1 by 11.3% (0.5560 to 0.6190) relative to the base model.

This repository ships the LoRA adapter. Loading it pulls the base model (Qwen/Qwen3-Embedding-0.6B) and applies the adapter on top. For a base-free, self-contained artifact (e.g. for vLLM / TEI serving), use a merged build instead.

1.1. TL;DR

  • Base model: Qwen/Qwen3-Embedding-0.6B (28 layers, hidden size 1024, last-token pooling, instruction-aware)
  • Domain / Language: Finance (BC Card: personal / merchant / corporate / VIP) / Korean
  • Task: Query-document retrieval (QA search, document similarity), RAG retriever
  • Method: PEFT (LoRA) + Multiple Negatives Ranking (contrastive)
  • Embedding dimension: 1024 · Max sequence length: 1024 · Similarity: cosine (outputs are L2-normalized)
  • Intended use
    • In-house BC Card-domain RAG retriever (Top-K candidate retrieval)
    • QA search, document-similarity scoring
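
Because the outputs are L2-normalized, cosine similarity reduces to a plain dot product, and Top-K retrieval is a sort over those scores. The following is a minimal pure-Python sketch of that idea; the toy 3-dimensional vectors and the helper names (`l2_normalize`, `top_k`) are illustrative assumptions, not part of the released model or its API.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))

def top_k(query_vec, doc_vecs, k=10):
    """Return (doc_index, score) pairs for the k highest-scoring documents."""
    scored = [(i, dot(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy 3-dimensional vectors standing in for 1024-dimensional embeddings.
query = l2_normalize([1.0, 2.0, 0.0])
docs = [l2_normalize(v) for v in ([2.0, 4.0, 0.0], [0.0, 0.0, 3.0], [1.0, 0.0, 0.0])]

ranking = top_k(query, docs, k=2)
print(ranking)  # document 0 points in the same direction as the query, so it ranks first
```

In a real index the same ranking would typically be done by a vector store or a matrix product rather than a Python loop.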

1.2. Usage

Install sentence-transformers and peft (required to apply the LoRA adapter); loading also downloads the base model Qwen/Qwen3-Embedding-0.6B on first use.

pip install -U sentence-transformers peft

Queries use an instruction prompt; documents use none (matching how the model was trained).

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BCCard/MoAI-Embedding-0.6B")

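# Query: "What is the annual fee for a BC Card?"
queries = ["BC카드 연회비는 어떻게 되나요?"]
documents = [
    # "BC Card annual fees are set differently depending on the card type and benefit package ..."
    "BC카드 연회비는 카드 종류와 혜택 구성에 따라 다르게 책정됩니다 ...",
    # "A lost card can be reported immediately through the customer center or the app ..."
    "카드 분실 신고는 고객센터 또는 앱에서 즉시 가능합니다 ...",
]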

# `prompt_name` selects the prompt stored in the model config
q_emb = model.encode(queries, prompt_name="query")        # query instruction auto-applied
d_emb = model.encode(documents, prompt_name="document")   # document side (no instruction)

scores = model.similarity(q_emb, d_emb)   # cosine; rank documents by score
print(scores)

  • Query prompt (instruction): Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery:
  • Document prompt: none
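
To make the asymmetry between the two sides concrete, the sketch below shows the text the encoder would see for a query versus a document. It assumes the prompt is simply prepended to the input string, which is how prompt-based encoding is generally understood to work in sentence-transformers; `apply_prompt` is a hypothetical helper written for illustration.

```python
# Prompt strings as listed above; "\n" is a real newline in the stored prompt.
QUERY_PROMPT = (
    "Instruct: Given a web search query, retrieve relevant passages "
    "that answer the query\nQuery:"
)
DOCUMENT_PROMPT = ""  # documents are encoded without an instruction

def apply_prompt(texts, prompt):
    """Prepend the prompt to every text (hypothetical helper for illustration)."""
    return [prompt + text for text in texts]

query_inputs = apply_prompt(["BC카드 연회비는 어떻게 되나요?"], QUERY_PROMPT)
doc_inputs = apply_prompt(
    ["카드 분실 신고는 고객센터 또는 앱에서 즉시 가능합니다 ..."], DOCUMENT_PROMPT
)

print(query_inputs[0])  # instruction, newline, then "Query:" followed by the question
print(doc_inputs[0])    # unchanged document text
```

If queries are embedded by a separate service that bypasses `prompt_name`, the same prefix would need to be added there, otherwise query and document embeddings may no longer match the training setup.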

1.3. Training Data

Dataset                             Role                                     Size
----------------------------------  ---------------------------------------  -----------------------
BCAI-Finance-Kor-Embedding-Triplet  Training (anchor / positive / negative)  43,394 triplets (train)
BCAI-Finance-Kor-Embedding-Pair     Corpus pool / evaluation                 36,281 unique chunks
  • Sources: BC Card financial QA (BCAI) + website crawl + synthetic data (chunking + multi-query generation)
  • Triplets are constructed via hard-negative mining over the unified corpus.
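
The card does not describe the mining procedure in detail, so the following is only a generic sketch of hard-negative mining: for each query, rank the corpus by similarity and take the best-scoring documents that are not labeled positive. The function names, the `skip_top` guard, and the toy scores are all assumptions made for illustration.

```python
def mine_hard_negatives(scores, positive_ids, num_negatives=1, skip_top=0):
    """Pick the highest-scoring non-positive documents for one query.

    scores:       dict mapping doc_id -> similarity to the query
    positive_ids: set of doc_ids labeled relevant for the query
    skip_top:     number of top non-positive candidates to skip, a common
                  guard against unlabeled positives (false negatives)
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    candidates = [doc_id for doc_id in ranked if doc_id not in positive_ids]
    return candidates[skip_top:skip_top + num_negatives]

def build_triplets(query, scores, positive_ids, num_negatives=1):
    """Combine each positive with each mined negative into (anchor, positive, negative)."""
    negatives = mine_hard_negatives(scores, positive_ids, num_negatives)
    return [(query, pos, neg) for pos in sorted(positive_ids) for neg in negatives]

# Toy similarity scores for one query over a four-document corpus.
scores = {"doc_a": 0.91, "doc_b": 0.88, "doc_c": 0.35, "doc_d": 0.12}
triplets = build_triplets("annual fee?", scores, positive_ids={"doc_a"})
print(triplets)  # doc_b is the closest non-positive, so it becomes the hard negative
```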

1.4. Training Procedure

Item            Value
--------------  ------------------------------------------------------------------
Method          LoRA (PEFT)
LoRA            r=64, alpha=128, dropout=0.05, targets = q,k,v,o,gate,up,down_proj
Loss            CachedMultipleNegativesRankingLoss (in-batch negatives)
Batch           per-device 256 (DDP) → 511 in-batch negatives per rank
LR / scheduler  1e-4 / cosine, warmup_ratio 0.1, weight_decay 0.01
Epochs          3, early stopping; best checkpoint selected by validation NDCG@10
Precision       bf16, gradient checkpointing
Hardware        6× NVIDIA L40S (DDP)
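
The loss and batch rows above can be illustrated with a small pure-Python sketch of a multiple-negatives ranking objective. It assumes each sample is an (anchor, positive, negative) triplet, in which case every anchor competes against the other 255 positives plus all 256 hard negatives on its device, which is consistent with the 511 figure. The `scale=20.0` value is an assumption (not stated in this card), and the cached variant used in training is understood to differ in memory handling rather than in the quantity being optimized.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy where anchor i must pick positive i among all candidates."""
    candidates = positives + negatives  # 2 * batch_size candidates per anchor
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]  # negative log-softmax at the true index
    return total / len(anchors)

def negatives_per_anchor(batch_size):
    """Other samples' positives (B - 1) plus every hard negative (B)."""
    return 2 * batch_size - 1

print(negatives_per_anchor(256))  # 511

# Toy 2-dimensional batch of two triplets.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[0.5, 0.5], [0.6, 0.4]]
loss = mnr_loss(anchors, positives, negatives)
print(round(loss, 4))
```

A larger per-device batch therefore raises the number of negatives each anchor sees, which is the usual motivation for pairing this loss with gradient caching.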

2. Evaluation

2.1. Training

Training ran for 3 epochs with early stopping and a cosine learning-rate schedule. Training loss decreases steadily, while validation NDCG@10 climbs early and then plateaus; the checkpoint at the validation peak is selected as the best one. Curves (loss / learning rate / validation NDCG@10) are logged to Weights & Biases.

Training curves - loss, learning rate, validation NDCG@10 (WandB)
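
For readers without access to the curves, the shape of the learning-rate trace can be approximated as follows. This is a sketch that assumes linear warmup over the first 10% of steps followed by cosine decay to zero; the warmup shape and `TOTAL_STEPS` are assumptions, since the card only states the peak rate (1e-4), the cosine scheduler, and `warmup_ratio` 0.1.

```python
import math

def lr_at(step, total_steps, base_lr=1e-4, warmup_ratio=0.1):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

TOTAL_STEPS = 1000  # hypothetical; the real step count is not given in this card

for step in (0, 50, 100, 550, 1000):
    print(step, f"{lr_at(step, TOTAL_STEPS):.2e}")
```

The rate peaks at the end of warmup (step 100 here) and falls to half the peak midway through the decay phase.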

2.2. In-domain Retrieval Benchmark

(1) Setup

  • Queries: 1,000 (held-out test split) · Corpus: 36,281 unique chunks
  • Protocol: binary-relevance information retrieval; the same evaluator used during training
  • Metrics: NDCG@10 (primary), MRR@10, Recall@{1,10}, Accuracy@1, MAP@10
  • Models compared: base (Qwen3-Embedding-0.6B, no fine-tuning) vs. v1 (r32 / lr2e-4 / 4ep) vs. v2 (r64 / lr1e-4 / 3ep, released)
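
As a reference for how the listed metrics behave under binary relevance, here is a minimal per-query sketch using common textbook definitions. It is not the evaluator used for the reported numbers, and the document IDs are made up. MAP@10 is omitted to keep the example short.

```python
import math

def accuracy_at_k(ranked, relevant, k):
    """1.0 if any relevant document appears in the top k, else 0.0."""
    return 1.0 if any(d in relevant for d in ranked[:k]) else 0.0

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant documents that appear in the top k."""
    hits = sum(1 for d in ranked[:k] if d in relevant)
    return hits / len(relevant)

def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant document in the top k (0.0 if none)."""
    for rank, d in enumerate(ranked[:k], start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    """Binary-gain NDCG: discounted hits divided by the best achievable value."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, d in enumerate(ranked[:k], start=1) if d in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg

# One query with two relevant chunks, retrieved at ranks 1 and 3.
ranked = ["d7", "d3", "d9", "d1"]
relevant = {"d7", "d9"}

print(accuracy_at_k(ranked, relevant, 1))  # 1.0
print(recall_at_k(ranked, relevant, 1))    # 0.5
print(mrr_at_k(ranked, relevant, 10))      # 1.0
print(ndcg_at_k(ranked, relevant, 10))
```

The example also shows why Recall@1 can sit below Accuracy@1 in a benchmark: when a query has more than one relevant chunk, a correct top hit scores 1.0 on accuracy but only a fraction on recall.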

(2) Test set

Figure: Test-set retrieval metrics (base vs. v1 vs. v2)
Figure: Test-set retrieval metrics comparison (per metric)

Metric      base (Qwen3-0.6B)  v1 (r32/2e-4/4ep)  v2 (r64/1e-4/3ep)  v2 Δ vs base
----------  -----------------  -----------------  -----------------  ---------------
NDCG@10     0.6186             0.6665             0.6695             +0.051 (+8.2%)
MRR@10      0.6449             0.6993             0.7060             +0.061 (+9.5%)
Recall@10   0.7046             0.7512             0.7508             +0.046 (+6.6%)
Recall@1    0.4730             0.5221             0.5293             +0.056 (+11.9%)
Accuracy@1  0.5560             0.6080             0.6190             +0.063 (+11.3%)
MAP@10      0.5652             0.6131             0.6171             +0.052 (+9.2%)

v2 is the released model. It scores highest on five of the six metrics; on Recall@10 it is on par with v1 (0.7508 vs. 0.7512). Fine-tuning lifts in-domain retrieval by 6.6% to 11.9% relative to the base model, with the largest gains on top-rank precision (Recall@1, Accuracy@1).
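
The delta column can be reproduced from the base and v2 scores: the first number is the absolute difference and the percentage is that difference divided by the base score. A short sketch of the arithmetic, using only values from the table:

```python
# (base, v2) scores copied from the results table.
results = {
    "NDCG@10": (0.6186, 0.6695),
    "MRR@10": (0.6449, 0.7060),
    "Recall@10": (0.7046, 0.7508),
    "Recall@1": (0.4730, 0.5293),
    "Accuracy@1": (0.5560, 0.6190),
    "MAP@10": (0.5652, 0.6171),
}

def format_delta(base, tuned):
    """Absolute gain and gain relative to the base score, formatted like the table."""
    abs_gain = tuned - base
    rel_gain = abs_gain / base * 100
    return f"+{abs_gain:.3f} (+{rel_gain:.1f}%)"

deltas = {metric: format_delta(base, v2) for metric, (base, v2) in results.items()}
for metric, delta in deltas.items():
    print(f"{metric:<11}{delta}")
```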


2.3. Limitations

  • Domain-specific: tuned for BC Card Korean financial text; out-of-domain or non-Korean performance is not guaranteed.
  • Re-ranking recommended: as a 0.6B bi-encoder, it favors recall/throughput over fine-grained precision.
    • Recommended pipeline: Bi-Encoder (this model) Top-K → Cross-Encoder re-ranking.
  • Sequence length: inputs are truncated at 1,024 tokens; content past that limit is not encoded, so very long documents should be chunked before indexing.
  • Exact-value matching: fine-grained numeric/tabular facts (fees, rates, dates, terms) are not reliably distinguished by dense similarity alone; pair with lexical (BM25) retrieval or a re-ranker when exactness matters.
  • Retrieval only: this is an embedding model, not a generator; it ranks passages and does not produce answers.
  • Synthetic data influence: part of the training set is LLM-synthesized (chunking + multi-query), which may carry the generator's stylistic/coverage biases.
  • PII: personal/card information was masked during preprocessing, but the model performs no PII protection at inference; apply your own masking/filtering on inputs and outputs.
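
For the sequence-length limitation, one simple way to chunk long documents before indexing is a sliding window with overlap. The sketch below works on a pre-tokenized list; the 128-token overlap and the `chunk_tokens` helper are assumptions for illustration, and in practice the tokens would come from the model's own tokenizer, possibly with a little headroom left for special tokens.

```python
def chunk_tokens(tokens, max_tokens=1024, overlap=128):
    """Split a token list into windows of at most max_tokens, sharing `overlap` tokens."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the current window already reaches the end of the document
    return chunks

# A made-up 2,500-token document.
tokens = [f"tok{i}" for i in range(2500)]
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])  # [1024, 1024, 708]
```

Each chunk is then embedded and indexed as its own document, so no content falls past the truncation point.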

3. Future Work

  • Data quality improvement & re-training
    • Human-annotation labeling
    • More rigorous hard-negative mining (iterative, mined with this model)
    • Broader/higher-quality data (incl. general financial corpora)
  • System-level
    • Cross-Encoder re-ranker for precision
    • HyDE / dynamic instruction injection at query time

4. Meta Info

4.1. Citation

@misc{bccard2026moaiembedding,
  title        = {MoAI-Embedding-0.6B: A BC Card-Domain Korean Text Embedding Model},
  author       = {BC Card},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/BCCard/MoAI-Embedding-0.6B}},
  note         = {LoRA fine-tune of Qwen3-Embedding-0.6B for BC Card-domain Korean retrieval}
}
