Vietnamese Calligraphy Ideogram4 LoRA (V10 Compound Gold)

This repository contains a low-rank adaptation (LoRA) checkpoint for generating high-fidelity Vietnamese calligraphy characters and phrases.

It is fine-tuned on top of Ideogram4 (9.3B) using multi-word compound datasets, with the goal of accurately rendering Vietnamese diacritics and brush styles.

Visual Results & Comparisons

Competitor Baseline Comparison

The comparison below renders the same phrase across different systems, showing where the fine-tuned model improves diacritic accuracy and preserves calligraphic aesthetics relative to Qwen Image, ERNIE Image, and commercial black-box generators:

Compound Eval28 Progress (Before vs. After SFT)

Evolution of diacritic binding during compound training epochs:

Preservation of Base Model Scene Capability

The LoRA adapter retains the base model's prompt-following and high-quality background rendering when calligraphy is rendered in complex scenes:

Model Details

Target Modules (6 modules):

  • attention.qkv
  • attention.o
  • feed_forward.w1
  • feed_forward.w2
  • feed_forward.w3
  • adaln_modulation
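
As a rough illustration of what a target list like this selects, the sketch below filters module names by suffix. The module names and the `select_target_modules` helper are hypothetical, chosen only for the example; how the actual injection code matches names is not described in this card.

```python
# Target module suffixes, copied from the list above.
TARGETS = [
    "attention.qkv",
    "attention.o",
    "feed_forward.w1",
    "feed_forward.w2",
    "feed_forward.w3",
    "adaln_modulation",
]


def select_target_modules(module_names, targets=TARGETS):
    """Keep names that equal a target or end with '.<target>' (assumed matching rule)."""
    return [
        name
        for name in module_names
        if any(name == t or name.endswith("." + t) for t in targets)
    ]


# Hypothetical module names, for illustration only.
example_names = [
    "blocks.0.attention.qkv",
    "blocks.0.attention.o",
    "blocks.0.attention.norm",
    "blocks.0.feed_forward.w1",
    "blocks.0.adaln_modulation",
    "final_layer.linear",
]

selected = select_target_modules(example_names)
print(selected)
```

Under this matching rule, normalization layers and the final projection are left without adapters, while every listed projection in each block receives one.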

Performance: Achieved 97.6% word-level accuracy on the held-out Eval28 panel (4 errors out of 168 words).

Files in this Repository

  • step-soup_infer.safetensors: The inference-ready checkpoint. Its weights are pre-scaled for the rsLoRA inference wrapper (alpha/sqrt(rank)).
  • step-soup.safetensors: The official training checkpoint (with the standard alpha/rank scale), useful for further fine-tuning or checkpoint averaging (souping).
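
To make the difference between the two scale conventions concrete, the short calculation below evaluates both for the `rank=64` and `alpha=64.0` values used in the inference example in this card. It only computes the two factors; exactly how the `_infer` file was rescaled is not specified here, so treat the ratio as an illustration rather than a description of the conversion.

```python
import math

# Values taken from the inference example in this card.
rank = 64
alpha = 64.0

standard_scale = alpha / rank           # standard LoRA scale
rslora_scale = alpha / math.sqrt(rank)  # rank-stabilized (rsLoRA) scale
ratio = rslora_scale / standard_scale   # equals sqrt(rank)

print(f"standard: {standard_scale}, rsLoRA: {rslora_scale}, ratio: {ratio}")
```

With these values the two conventions differ by a factor of 8, which is why loading a checkpoint under the wrong convention would noticeably change the adapter's strength.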

Inference Usage

To run inference, load the base FP8 Ideogram4 model and inject these LoRA weights using DiffSynth-Studio.

Python Example Code

import json

import torch
from diffsynth.core import ModelConfig
from diffsynth.pipelines.ideogram4 import Ideogram4Pipeline
from hybrid_peft_ideogram4 import inject_lora_into_dit, load_lora_checkpoint

# 1. Define model directory paths (make sure to download FP8 Ideogram4 components)
model_dir = "models/ideogram-ai/ideogram-4-fp8"
lora_ckpt = "step-soup_infer.safetensors"  # Downloaded from this repository

# 2. Initialize Pipeline
pipe = Ideogram4Pipeline.from_pretrained(
    model_dir,
    torch_dtype=torch.bfloat16,
    device="cuda"
)

# 3. Inject LoRA weights into DiT (the six target modules listed under Model Details)
inject_lora_into_dit(
    pipe.dit,
    targets=[
        "attention.qkv",
        "attention.o",
        "feed_forward.w1",
        "feed_forward.w2",
        "feed_forward.w3",
        "adaln_modulation",
    ],
    rank=64,
    alpha=64.0,
)
load_lora_checkpoint(pipe.dit, lora_ckpt)

# 4. Build prompt utilizing layout-aware no-bbox description
prompt_json = json.dumps({
    "high_level_description": 'Vietnamese calligraphy artwork of the phrase "An Khang Thịnh Vượng" in traditional brush style. The text is written in Vietnamese alphabet.',
    "style_description": {
        "art_style": "calligraphy",
        "ink_color": "black",
        "brush_style": "Traditional Vietnamese brush calligraphy, bold and elegant strokes",
        "writing_surface": "Plain white rice-paper background, no texture, no border"
    },
    "compositional_deconstruction": {
        "background": "Plain white rice-paper background, no texture, no border.",
        "elements": [{
            "type": "text",
            "text": "An Khang\nThịnh Vượng",
            "desc": "Traditional Vietnamese calligraphy characters arranged in a tidy grid of several stacked rows, multiple words per row, evenly spaced and centered, written in bold black ink brush strokes. Font: Thanh Cong Unicode.",
        }],
    },
}, ensure_ascii=False)

# 5. Run Generation
image = pipe(
    prompt=prompt_json,
    cfg_scale=7.0,
    num_inference_steps=48,
    seed=7000
)
image.save("vietnamese_calligraphy_output.png")
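
To reuse the prompt structure from step 4 for other phrases, it can be wrapped in a small helper. The sketch below is one possible way to do that: the function name `build_calligraphy_prompt`, the `words_per_row` parameter, and the NFC normalization step are assumptions added for illustration and are not part of the original pipeline. The field names and description strings are copied from the example above.

```python
import json
import unicodedata


def build_calligraphy_prompt(phrase: str, words_per_row: int = 2) -> str:
    """Build a JSON prompt string shaped like the example in step 4."""
    # Assumption: use precomposed (NFC) characters so each diacritic is
    # attached to its base letter; the card does not state a requirement.
    words = unicodedata.normalize("NFC", phrase).split()
    phrase = " ".join(words)
    # Split the phrase into rows, joined by newlines as in the original prompt.
    rows = [
        " ".join(words[i:i + words_per_row])
        for i in range(0, len(words), words_per_row)
    ]
    surface = "Plain white rice-paper background, no texture, no border"
    prompt = {
        "high_level_description": (
            f'Vietnamese calligraphy artwork of the phrase "{phrase}" '
            "in traditional brush style. "
            "The text is written in Vietnamese alphabet."
        ),
        "style_description": {
            "art_style": "calligraphy",
            "ink_color": "black",
            "brush_style": "Traditional Vietnamese brush calligraphy, bold and elegant strokes",
            "writing_surface": surface,
        },
        "compositional_deconstruction": {
            "background": surface + ".",
            "elements": [{
                "type": "text",
                "text": "\n".join(rows),
                "desc": (
                    "Traditional Vietnamese calligraphy characters arranged in a tidy grid "
                    "of several stacked rows, multiple words per row, evenly spaced and "
                    "centered, written in bold black ink brush strokes. "
                    "Font: Thanh Cong Unicode."
                ),
            }],
        },
    }
    # ensure_ascii=False keeps Vietnamese characters readable instead of \u escapes.
    return json.dumps(prompt, ensure_ascii=False)


prompt_json = build_calligraphy_prompt("An Khang Thịnh Vượng")
print(prompt_json)
```

The returned string can be passed as `prompt=` in step 5. Whether row layouts other than two words per row render well is untested here.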

Citation

If you use this model or code in your research, please cite our Master's thesis:

@mastersthesis{dopt2026vietnamesecalligraphy,
  author       = {Đỗ Tuấn Phong},
  title        = {Fine-tuning Ideogram4 for Vietnamese Calligraphy Text Rendering},
  school       = {FPT University},
  address      = {Hanoi, Vietnam},
  year         = {2026},
  type         = {{M.Sc.}},
  month        = jun,
  note         = {MSE-AI program}
}