XaaS Gemma 2 2B - Stage 4: Summarization + Tag Fine-Tuning

Stage 4 of 4 in the XaaS fine-tuning pipeline for Korean international trade.

This model is a LoRA adapter (r=256) fine-tuned from the QA model (lablup/gemma-2-2b-it-xaas-qa) to summarize and topic-tag B2B supply-chain email threads. Given a multi-turn email conversation between a Korean buyer and an overseas supplier, it generates a list of 5–8 Korean topic tags and a Korean prose summary.

Pipeline Position

google/gemma-2-2b-it
    ↓
lablup/gemma-2-2b-it-xaas-cpt
    ↓
lablup/gemma-2-2b-it-xaas-qa
    ↓  [this model: LoRA adapter]
lablup/gemma-2-2b-it-xaas-sum-tag  ← you are here

Training Details

| Parameter | Value |
|---|---|
| Base model | lablup/gemma-2-2b-it-xaas-qa |
| Method | Supervised fine-tuning (SFT) with LoRA |
| LoRA rank (r) | 256 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Epochs | ~30 (1,050 steps) |
| Learning rate | 2e-3 |
| Max sequence length | 8,192 tokens |
| Batch size (effective) | 32 (2 per device × 16 gradient accumulation) |
| Optimizer | paged_adamw_8bit |
| Precision | bfloat16 |
| Framework | HuggingFace TRL SFTTrainer |
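
Two derived quantities follow from these hyperparameters and may help when reproducing the run. The sketch below is an illustration, not the authors' training script. It assumes standard LoRA scaling (lora_alpha / r, with no rank-stabilized variant) and a single device, since 2 × 16 already equals the stated effective batch of 32. All variable names are hypothetical.

```python
# Values copied from the Training Details section.
lora_r = 256
lora_alpha = 32
per_device_batch = 2
grad_accum_steps = 16
num_devices = 1  # assumption: 2 x 16 = 32 matches the stated effective batch only on one device

# Standard LoRA scales the low-rank update by alpha / r (assumption: rsLoRA is not enabled).
lora_scale = lora_alpha / lora_r

# Effective batch size = per-device batch x accumulation steps x number of devices.
effective_batch = per_device_batch * grad_accum_steps * num_devices

print(f"LoRA scaling factor: {lora_scale}")
print(f"Effective batch size: {effective_batch}")
```

Under these assumptions the scaling factor is 0.125, so the adapter update is damped compared with setups where alpha equals or exceeds r.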

Training Data

The training set is lablup/tariff_trade_domain.synthetic_trade_email_sum_tag_kr (sum_tag config, Korean summaries). It contains 1,188 synthetic B2B supply-chain email threads, each paired with:

  • Korean prose summaries (summary)
  • 5–8 Korean topic tags (tags, e.g. ["공급망", "협상", "건설", "계약"])

The data was generated with GPT-4o-mini and spans 20 industries and 17 conversation styles.
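
The dataset size can be cross-checked against the reported step count. This is a rough sketch under two assumptions that the source does not confirm: all 1,188 threads are used for training (no held-out split), and a final partial batch counts as a full optimizer step. Variable names are hypothetical.

```python
import math

num_examples = 1188    # threads in the sum_tag config
effective_batch = 32   # from Training Details
total_steps = 1050     # from Training Details

# Assumption: no held-out split, and the last partial batch still counts as one step.
steps_per_epoch = math.ceil(num_examples / effective_batch)
epochs = total_steps / steps_per_epoch

print(f"Steps per epoch: {steps_per_epoch}")
print(f"Approximate epochs: {epochs:.1f}")
```

The estimate lands a little under the reported ~30 epochs. A held-out evaluation split would shrink the per-epoch step count and push the figure upward, but the source does not say whether one was used.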

How to Use

This is a LoRA adapter, so load it with PEFT on top of the base QA model:

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model_id = "lablup/gemma-2-2b-it-xaas-qa"
adapter_id = "lablup/gemma-2-2b-it-xaas-sum-tag"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

def summarize_and_tag(conversation: str) -> str:
    # Korean instruction: "Summarize the conversation and generate tags:"
    prompt_text = f"대화를 요약하고 태그를 생성해줘:\n{conversation}"
    messages = [{"role": "user", "content": prompt_text}]
    # Wrap the request in the chat template and open the model's turn.
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Greedy decoding keeps the structured output deterministic.
        outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

conversation = """
๋ณด๋‚ธ ์‚ฌ๋žŒ: minji.kim@aeropack.co.kr
์ˆ˜์‹ : sales@packagingworld.com
์ œ๋ชฉ: ๋งž์ถคํ˜• ํฌ์žฅ ์†”๋ฃจ์…˜ ๊ฒฌ์  ์š”์ฒญ
...
"""
output = summarize_and_tag(conversation)
print(output)
# tags:["ํ˜‘์ƒ", "๊ณ„์•ฝ", "ํฌ์žฅ", "๊ณต๊ธ‰์—…์ฒด", "๋ฌผ๋ฅ˜"] <sep> summary:AeroPack Solutions์™€ ๊ณต๊ธ‰์—…์ฒด ๊ฐ„์˜ ...

Parse structured output

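import json

def parse_output(raw: str) -> dict:
    # Split on the first "<sep>" only; a missing separator is reported explicitly.
    tags_part, sep, summary_part = raw.partition("<sep>")
    if not sep:
        raise ValueError("expected 'tags:[...] <sep> summary:...'")
    # Strip the leading labels (str.removeprefix needs Python 3.9+).
    tags = json.loads(tags_part.strip().removeprefix("tags:"))
    summary = summary_part.strip().removeprefix("summary:").strip()
    return {"tags": tags, "summary": summary}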

Output Format

The model outputs tags first, then a summary, separated by <sep>:

tags:["공급망", "거래협상", "견적요청", "납품조건", "결제"] <sep> summary:한국사무용품과 Global Staples Inc. 간의 공급망 상호작용으로, 초기 문의 및 긴급성, 수량 협상, 납품 일정, 결제 조건이 논의되었으며 최종 합의에 도달하였습니다.
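
Because the layout is fixed, a generated string can be checked mechanically before it is stored or displayed. The following is a minimal sketch, assuming the output always follows the `tags:[...] <sep> summary:...` layout shown above. The function `validate_output`, the `tag_count_ok` field, and the 5–8 bounds as defaults are illustrative choices, not part of the released model or its tooling.

```python
import json
import re

# Documented layout: tags:<JSON list> <sep> summary:<text>
OUTPUT_RE = re.compile(r"^\s*tags:\s*(\[.*?\])\s*<sep>\s*summary:\s*(.*)$", re.DOTALL)


def validate_output(raw: str, min_tags: int = 5, max_tags: int = 8) -> dict:
    """Parse a raw generation and flag whether the tag count is in range."""
    match = OUTPUT_RE.match(raw)
    if match is None:
        raise ValueError("output does not match 'tags:[...] <sep> summary:...'")
    tags = json.loads(match.group(1))
    if not all(isinstance(tag, str) for tag in tags):
        raise ValueError("tags must be a JSON list of strings")
    return {
        "tags": tags,
        "summary": match.group(2).strip(),
        # The model card describes 5-8 tags per thread; flag anything outside that range.
        "tag_count_ok": min_tags <= len(tags) <= max_tags,
    }


raw = (
    'tags:["공급망", "거래협상", "견적요청", "납품조건", "결제"] <sep> '
    "summary:한국사무용품과 Global Staples Inc. 간의 공급망 상호작용으로, "
    "초기 문의 및 긴급성, 수량 협상, 납품 일정, 결제 조건이 논의되었으며 "
    "최종 합의에 도달하였습니다."
)
result = validate_output(raw)
print(result["tags"])
print(result["tag_count_ok"])
```

The sample above carries five tags, so `tag_count_ok` is true. A generation with fewer than five or more than eight tags still parses but is flagged, which lets the caller decide whether to retry or accept it.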

Merging the Adapter (optional)

To produce a standalone model that does not depend on PEFT at load time:

merged = model.merge_and_unload()
merged.save_pretrained("gemma-2-2b-it-xaas-sum-tag-merged")
tokenizer.save_pretrained("gemma-2-2b-it-xaas-sum-tag-merged")

Limitations

  • Summaries and tags are generated for Korean-language conversations; English-only threads may work but were underrepresented in training
  • Tag vocabulary reflects the 20 industry categories in the training dataset
  • Summary length and detail level depend on conversation length; very short conversations may produce sparse summaries

License

Built on Google Gemma 2 and subject to the Gemma Terms of Use.
