Text Generation
PEFT
Safetensors
English
qwen3
qlora
lora
structured-output
conversational

qwen3-4b-structured-output-merged

This repository provides a merged full model based on Qwen/Qwen3-4B-Instruct-2507, trained with LoRA for structured output tasks.

The model was fine-tuned with LoRA and then merged into the base model weights, so it can be loaded directly without PEFT.

Training Objective

This model is optimized to improve structured output accuracy (JSON / YAML / XML / TOML / CSV).

Loss was applied only to the final assistant output, while intermediate reasoning (Chain-of-Thought) was masked during training.

Training Configuration

Base model: Qwen/Qwen3-4B-Instruct-2507 Method: QLoRA (4-bit) → merged after training Max sequence length: 2048 LoRA config (during training): r=64, alpha=128 Loss: applied only to the final assistant output (think tags always masked) Code fences: applied to all samples from Step 2 onward


Step 1

Learning rate: 1e-6 Dataset: daichira/structured-5k-mix-sft Notes: Initial SFT. Think tags were excluded from the loss.

Step 2

Learning rate: 3e-7 Dataset: daichira/structured-5k-mix-sft Notes: Code fences were added to all samples.

Step 3

Learning rate: 3e-7 Dataset: daichira/structured-5k-mix-sft Notes: The following tasks were upsampled 2×:

  • csv → json
  • csv → xml
  • csv → yaml

Step 4

Learning rate: 3e-7 Dataset: daichira/structured-hard-sft-4k Notes: Training on the hard dataset for robustness.

Step 5

Learning rate: 1e-6 Dataset: daichira/structured-5k-mix-sft Notes: Training focused only on:

  • csv → json
  • csv → xml

Step 6

Learning rate: 1e-6 Dataset: daichira/structured-5k-mix-sft Notes: Continued training on:

  • csv → json
  • csv → xml

Step 7

Learning rate: 5e-7 Dataset: daichira/structured-hard-sft-4k Notes: Focused training on:

  • xml → yaml (hard)

Step 8

Learning rate: 5e-7 Dataset: daichira/structured-hard-sft-4k Notes: Focused training on:

  • text → yaml (hard)
  • xml → yaml (hard)

Step 9

Learning rate: 5e-7 Dataset: daichira/structured-5k-mix-sft Notes: Focused training on:

  • csv → xml

Step 10

Learning rate: 1e-7 Dataset: daichira/structured-3k-mix-sft Notes: Final stabilization step focused on:

  • csv → xml

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "kiratan/qwen3-4b-structeval-lora-57-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

Sources & Terms (IMPORTANT)

Training data:

  • daichira/structured-3k-mix-sft
  • daichira/structured-hard-sft-4k
  • daichira/structured-5k-mix-sft

Dataset License: CC-BY-4.0 License.

Compliance: Users must comply with the CC-BY-4.0 license terms for datasets and the base model's original terms of use.

Downloads last month
51
Safetensors
Model size
4B params
Tensor type
F16
·
Inference Providers NEW
Input a message to start chatting with kiratan/qwen3-4b-structeval-lora-57-merged.

Model tree for kiratan/qwen3-4b-structeval-lora-57-merged

Adapter
(5498)
this model

Datasets used to train kiratan/qwen3-4b-structeval-lora-57-merged