XaaS Gemma 2 2B — Stage 3: KIE Fine-Tuning (Production Model)

Stage 3 of 4 in the XaaS fine-tuning pipeline for Korean international trade.

Fine-tuned from the QA model (lablup/gemma-2-2b-it-xaas-qa) for Key Information Extraction (KIE) from B2B supply-chain email threads. Given a multi-turn email conversation between a Korean buyer and an overseas supplier, the model extracts structured trade information (contract terms, parties, dates, prices, delivery schedule) as YAML. This is the production merged model deployed via vLLM in the XaaS API.

Pipeline Position

google/gemma-2-2b-it
    ↓
lablup/gemma-2-2b-it-xaas-cpt
    ↓
lablup/gemma-2-2b-it-xaas-qa
    ↓  [this model]
lablup/gemma-2-2b-it-xaas-kie  ← you are here  (production)

Training Details

Parameter               Value
Base model              lablup/gemma-2-2b-it-xaas-qa
Method                  Supervised fine-tuning (SFT) with LoRA, then merged
LoRA rank (r)           256
LoRA alpha              32
LoRA dropout            0.05
Target modules          q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Learning rate           2e-3
Max sequence length     6,000 tokens
Batch size (effective)  64 (2 per GPU × 16 gradient accumulation × 2 nodes)
Optimizer               paged_adamw_8bit
Precision               bfloat16
Distributed training    DeepSpeed ZeRO-3, 2 nodes
Framework               HuggingFace TRL SFTTrainer + DeepSpeed

The LoRA adapter has been merged into the base weights. Load directly with AutoModelForCausalLM (no PEFT dependency required).
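
The derived figures in the table can be sanity-checked with a few lines of arithmetic. The sketch below assumes the standard LoRA scaling factor alpha / r (not the rank-stabilized variant), which this card does not state explicitly. The dictionary keys are illustrative names, not TRL or PEFT arguments.

```python
# Hyperparameters copied from the Training Details table.
# The key names are illustrative, not actual TRL/PEFT argument names.
hparams = {
    "lora_r": 256,
    "lora_alpha": 32,
    "per_gpu_batch": 2,
    "grad_accum_steps": 16,
    "nodes": 2,
}


def effective_batch_size(h: dict) -> int:
    # Mirrors the formula in the table: per-GPU batch x accumulation x nodes.
    return h["per_gpu_batch"] * h["grad_accum_steps"] * h["nodes"]


def lora_scale(h: dict) -> float:
    # Assumption: standard LoRA scaling, where the low-rank update
    # is multiplied by alpha / r before being added to the weights.
    return h["lora_alpha"] / h["lora_r"]


print(effective_batch_size(hparams))  # 64
print(lora_scale(hparams))            # 0.125
```

Under that assumption, the merged weights carry the low-rank update scaled by 0.125.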

Training Data

lablup/tariff_trade_domain.synthetic_trade_email_kie_kr — 1,188 synthetic B2B supply-chain email threads, each paired with a structured YAML extraction of:

  • 계약 및 조건 (contract terms, payment conditions)
  • 참여자 (buyer/supplier parties)
  • 날짜 / 이벤트 (dates, key milestones)
  • 가격 / 배송 조건 (pricing, delivery schedule)

Generated by GPT-4o-mini across 20 industries (Aerospace, Technology, Manufacturing, Healthcare, ...).

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "lablup/gemma-2-2b-it-xaas-kie"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

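def extract_kie(email_thread: str) -> str:
    """Run KIE on one email thread and return the model's raw text output."""
    # Korean instruction: "Extract the contract-related information from the
    # following email conversation in YAML format."
    prompt_text = (
        "다음 이메일 대화에서 계약 관련 정보를 YAML 형식으로 추출하세요.\n\n"
        f"{email_thread}"
    )
    messages = [{"role": "user", "content": prompt_text}]
    # Wrap the request in the chat template and open the model's turn.
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Greedy decoding keeps the extraction deterministic.
        outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)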

email = """
**Buyer Details:**
- Name: 박지훈
- Company: SkyLine Aerospace Ltd.

**Email Exchange:**
From: jihoon.park@skylineaerospace.kr
Subject: 항공용 알루미늄 부품 100개 견적 요청
...
"""
print(extract_kie(email))
# ```yaml
# 계약 및 조건:
#     결제 조건: 배송 시 결제
#     배송 일정: 주문 확인일로부터 2주 이내
# ...
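
The sample output above begins with a Markdown code fence, so the raw string usually needs that wrapper removed before it is handed to a YAML parser. The helper below is a minimal sketch using only the standard library; `strip_yaml_fence` is a hypothetical name, and the handling of a missing closing fence (for example when generation stops at `max_new_tokens`) is an assumption about how truncated outputs look.

```python
import re

FENCE = "`" * 3  # three backticks, built at runtime to keep this snippet readable

# Optional opening fence (with or without a yaml/yml tag), the body,
# then either a closing fence or the end of the text.
_FENCED = re.compile(
    FENCE + r"(?:yaml|yml)?[ \t]*\n(.*?)(?:\n" + FENCE + r"|\Z)",
    re.DOTALL,
)


def strip_yaml_fence(text: str) -> str:
    """Return the YAML body of a fenced block, or the text itself if unfenced."""
    match = _FENCED.search(text)
    body = match.group(1) if match else text
    # Trim blank lines only, so YAML indentation is left untouched.
    return body.strip("\n")


raw = FENCE + "yaml\n계약 및 조건:\n    결제 조건: 배송 시 결제\n" + FENCE
print(strip_yaml_fence(raw))
```

If PyYAML is installed, the returned string can then be passed to `yaml.safe_load` to obtain a dictionary.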

OpenAI-compatible API (vLLM)

from openai import OpenAI

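client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")
email_thread = "..."  # the full email thread text, as in the example above
response = client.chat.completions.create(
    model="xaas-gemma-2-2b-it-lora128",  # must match the model name your vLLM server exposes
    messages=[{
        "role": "user",
        # f-string, so the thread is actually interpolated into the prompt
        "content": f"다음 이메일 대화에서 계약 관련 정보를 YAML 형식으로 추출하세요.\n\n{email_thread}"
    }],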
    max_tokens=1024,
)
print(response.choices[0].message.content)

Production Deployment

Served with vLLM using --max-model-len 8128 and --tensor-parallel-size 1. The model weights are stored in float16 and take about 5 GB.
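
With a fixed context window, the prompt and the requested completion have to fit together. The sketch below works out the prompt budget implied by the settings in this card (8128-token window, 1024 completion tokens as in the API example). The function names are hypothetical, and the prompt token count is assumed to be measured after the chat template has been applied.

```python
MAX_MODEL_LEN = 8128    # --max-model-len from the deployment settings
MAX_NEW_TOKENS = 1024   # max_tokens used in the API example


def prompt_budget(max_model_len: int = MAX_MODEL_LEN,
                  max_new_tokens: int = MAX_NEW_TOKENS) -> int:
    """Largest prompt length (in tokens) that still leaves room for the completion."""
    return max_model_len - max_new_tokens


def fits_context(prompt_tokens: int,
                 max_new_tokens: int = MAX_NEW_TOKENS,
                 max_model_len: int = MAX_MODEL_LEN) -> bool:
    """True if prompt plus completion stay within the context window."""
    return prompt_tokens + max_new_tokens <= max_model_len


print(prompt_budget())     # 7104
print(fits_context(6000))  # True
print(fits_context(7200))  # False
```

A thread at the 6,000-token training limit therefore fits with room to spare, while much longer threads would need truncation or a smaller `max_tokens`.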

Expected Output Format

계약 및 조건:
    결제 조건: 선불 50%, 잔금 배송 
    배송 일정: 계약 체결  4
    보증: 12개월
참여자:
    구매자: 박지훈, SkyLine Aerospace Ltd.
    공급업체: GlobalParts Inc.
날짜:
    문의일: 2024-07-26
    예상 납기: 2024-08-23
이벤트:
    - 초기 문의 및 사양 확인
    - 가격 협상 (10% 대량 할인 적용)
    - 최종 계약 합의
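
Once an extraction has been parsed into a dictionary, a quick completeness check can flag sections the model skipped. The sketch below is an assumption-laden example: `EXPECTED_KEYS` is inferred from the example outputs in this card rather than from a published schema, and `missing_sections` is a hypothetical helper name.

```python
# Assumed top-level keys, inferred from the example outputs in this card;
# adjust the tuple if the real schema differs.
EXPECTED_KEYS = ("계약 및 조건", "참여자", "날짜", "이벤트")


def missing_sections(parsed: dict) -> list:
    """Return expected top-level sections that are absent or empty."""
    return [key for key in EXPECTED_KEYS if not parsed.get(key)]


example = {
    "계약 및 조건": {"보증": "12개월"},
    "참여자": {"공급업체": "GlobalParts Inc."},
    "날짜": {"문의일": "2024-07-26"},
}
print(missing_sections(example))  # ['이벤트']
```

An empty list means every assumed section is present; it does not verify that the extracted values are correct.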

Limitations

  • Training data is LLM-generated; extraction accuracy on real emails has not been independently verified
  • YAML schema is fixed to the training format; highly irregular email structures may produce incomplete extractions
  • Optimized for Korean-buyer / English-supplier email threads; pure Korean or pure English threads may work but were less represented in training

License

Built on Google Gemma 2 and subject to the Gemma Terms of Use.
