XaaS Gemma 2 2B — Stage 4: Summarization + Tag Fine-Tuning

Stage 4 of 4 in the XaaS fine-tuning pipeline for Korean international trade.

This model is a LoRA adapter (r=256) fine-tuned from the Stage 3 QA model (lablup/gemma-2-2b-it-xaas-qa) for summarization and topic tagging of B2B supply-chain email threads. Given a multi-turn email conversation between a Korean buyer and an overseas supplier, it generates a Korean prose summary and a list of 5–8 Korean topic tags.

Pipeline Position

google/gemma-2-2b-it
    ↓
lablup/gemma-2-2b-it-xaas-cpt
    ↓
lablup/gemma-2-2b-it-xaas-qa
    ↓  [this model — LoRA adapter]
lablup/gemma-2-2b-it-xaas-sum-tag  ← you are here

Training Details

Parameter	Value
Base model	`lablup/gemma-2-2b-it-xaas-qa`
Method	Supervised fine-tuning (SFT) with LoRA
LoRA rank (r)	256
LoRA alpha	32
LoRA dropout	0.05
Target modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Epochs	~30 (1,050 steps)
Learning rate	2e-3
Max sequence length	8,192 tokens
Batch size (effective)	32 (2 per device × 16 gradient accumulation)
Optimizer	paged_adamw_8bit
Precision	bfloat16
Framework	HuggingFace TRL SFTTrainer
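
As a quick sanity check on the table, the sketch below restates the LoRA settings as a plain dictionary and derives two quantities from them. The keys mirror the argument names of `peft.LoraConfig`, but the original training script is not shown here, so the dictionary, the `task_type` value, and the single-device assumption are illustrative rather than the authors' actual configuration. The scaling factor assumes standard LoRA scaling (alpha / r), not the rank-stabilized variant.

```python
# Values copied from the Training Details table; names are illustrative.
LORA_KWARGS = {
    "r": 256,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the table
}
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 16
NUM_DEVICES = 1  # assumption: 2 x 16 = 32 only holds on a single device

# Examples consumed per optimizer step.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES

# Standard LoRA multiplies the low-rank update by alpha / r.
lora_scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]

print(effective_batch)  # 32
print(lora_scaling)     # 0.125
```

With alpha well below the rank, the adapter update is scaled down by a factor of 8 under this assumption, which is worth keeping in mind when reading the comparatively high learning rate.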

Training Data

lablup/tariff_trade_domain.synthetic_trade_email_sum_tag_kr (sum_tag config, Korean summaries) — 1,188 synthetic B2B supply-chain email threads paired with:

Korean prose summaries (summary)
5–8 Korean topic tags (tags, e.g. ["공급망", "협상", "건설", "계약"])

Generated by GPT-4o-mini across 20 industries and 17 conversation styles.
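
The dataset size can be cross-checked against the step count in the training table. The arithmetic below is a rough estimate that assumes all 1,188 threads were used for training; the card does not say whether a held-out split exists, and a smaller training split would push the number of passes closer to the stated ~30.

```python
# Figures taken from this card; the "all threads used for training" part is an assumption.
NUM_THREADS = 1188      # synthetic email threads in the dataset
EFFECTIVE_BATCH = 32    # 2 per device x 16 gradient accumulation
TOTAL_STEPS = 1050      # optimizer steps reported in the table

# Each optimizer step consumes one effective batch of examples.
examples_seen = TOTAL_STEPS * EFFECTIVE_BATCH

# Number of full passes over the data implied by those figures.
passes = examples_seen / NUM_THREADS

print(examples_seen)     # 33600
print(f"{passes:.1f}")   # 28.3
```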

How to Use

This is a LoRA adapter — load it with PEFT on top of the base QA model:

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model_id = "lablup/gemma-2-2b-it-xaas-qa"
adapter_id = "lablup/gemma-2-2b-it-xaas-sum-tag"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

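def summarize_and_tag(conversation: str) -> str:
    # Korean instruction, roughly "Summarize the conversation and generate tags:".
    prompt_text = f"대화를 요약하고 태그를 생성해줘:\n{conversation}"
    messages = [{"role": "user", "content": prompt_text}]
    # Render the chat template as text and append the marker that opens the model's turn.
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Greedy decoding (do_sample=False) keeps the output deterministic.
        outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)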

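# Example thread; the Korean headers are sender, recipient, and subject
# ("Request for a quote on custom packaging solutions").
conversation = """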
보낸 사람: minji.kim@aeropack.co.kr
수신: sales@packagingworld.com
제목: 맞춤형 포장 솔루션 견적 요청
...
"""
output = summarize_and_tag(conversation)
print(output)
# tags:["협상", "계약", "포장", "공급업체", "물류"] <sep> summary:AeroPack Solutions와 공급업체 간의 ...

Parse structured output

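import json

def parse_output(raw: str) -> dict:
    # Split on the first separator only, so a "<sep>" inside the summary is preserved.
    tags_part, sep, summary_part = raw.partition(" <sep> ")
    if not sep:
        raise ValueError("expected 'tags:[...] <sep> summary:...'")
    # Strip only the leading labels (str.removeprefix requires Python 3.9+).
    tags = json.loads(tags_part.strip().removeprefix("tags:").strip())
    summary = summary_part.strip().removeprefix("summary:").strip()
    return {"tags": tags, "summary": summary}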

Output Format

The model outputs tags first, then a summary, separated by <sep>:

tags:["공급망", "거래협상", "견적요청", "납품조건", "결제"] <sep> summary:한국사무용품과 Global Staples Inc. 간의 공급망 상호작용으로, 초기 문의 및 긴급성, 수량 협상, 납품 일정, 결제 조건이 논의되었으며 최종 합의에 도달하였습니다.
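
Because the output is free-form generated text, it can be useful to check that a response actually follows this format before passing it downstream. The following is a minimal sketch of such a check; `validate_output` and its `min_tags`/`max_tags` parameters are hypothetical names introduced here, and the 5–8 bounds simply restate the tag count described in this card.

```python
import json

SEP = " <sep> "

def validate_output(raw: str, min_tags: int = 5, max_tags: int = 8) -> list[str]:
    """Return a list of format problems found in a raw model output (empty if none)."""
    tags_part, sep, summary_part = raw.partition(SEP)
    if not sep:
        return ["missing ' <sep> ' separator"]

    problems = []
    if not tags_part.startswith("tags:"):
        problems.append("missing 'tags:' prefix")
    if not summary_part.startswith("summary:"):
        problems.append("missing 'summary:' prefix")

    # The tag list is expected to be a JSON array of strings.
    try:
        tags = json.loads(tags_part.removeprefix("tags:").strip())
    except json.JSONDecodeError:
        problems.append("tags are not valid JSON")
        return problems

    if not (isinstance(tags, list) and all(isinstance(t, str) for t in tags)):
        problems.append("tags must be a JSON list of strings")
    elif not min_tags <= len(tags) <= max_tags:
        problems.append(f"expected {min_tags}-{max_tags} tags, got {len(tags)}")

    if not summary_part.removeprefix("summary:").strip():
        problems.append("empty summary")
    return problems


example = 'tags:["공급망", "거래협상", "견적요청", "납품조건", "결제"] <sep> summary:한국사무용품과 Global Staples Inc. 간의 공급망 상호작용'
print(validate_output(example))             # []
print(validate_output("summary only"))      # ["missing ' <sep> ' separator"]
```

Returning a list of problems rather than raising lets a caller decide whether to retry generation, log the response, or fall back to a looser parser.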

Merging the Adapter (optional)

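To produce a standalone model that loads without PEFT, merge the adapter weights into the base model (here `model` is the PeftModel loaded in How to Use):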

merged = model.merge_and_unload()
merged.save_pretrained("gemma-2-2b-it-xaas-sum-tag-merged")
tokenizer.save_pretrained("gemma-2-2b-it-xaas-sum-tag-merged")
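
Once the merged checkpoint has been saved, it should load with plain `transformers`. The sketch below assumes the directory name used in the merge step above; `load_merged` is a hypothetical helper, and the dtype and device settings simply mirror the ones in How to Use.

```python
from pathlib import Path

MERGED_DIR = Path("gemma-2-2b-it-xaas-sum-tag-merged")

def load_merged(model_dir: Path = MERGED_DIR):
    """Load the merged checkpoint without PEFT and return (tokenizer, model)."""
    # Fail fast with a clear message before importing the heavy dependencies.
    if not model_dir.is_dir():
        raise FileNotFoundError(f"merged checkpoint not found: {model_dir}")

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(
        model_dir,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_merged()` after the merge gives objects that can be dropped into `summarize_and_tag` in place of the PEFT-wrapped ones.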

Limitations

The model targets Korean-language conversations and produces Korean summaries and tags; English-only threads may work but were underrepresented in training.
The tag vocabulary reflects the 20 industry categories in the training dataset, so tags for threads from other industries may be less reliable.
Summary length and level of detail depend on conversation length; very short conversations may produce sparse summaries.