Qwen3.5-9B Humanize SFT

A LoRA adapter fine-tuned on Qwen3.5-9B for Chinese text humanization: it rewrites AI-generated Chinese text so that it reads like natural human writing, while preserving the original meaning and factual content.

This is the SFT foundation for the humanization model series. All DPO versions build on this adapter.

Model Details

| Item | Value |
| --- | --- |
| Base model | unsloth/Qwen3.5-9B |
| Fine-tuning method | SFT (supervised fine-tuning) |
| LoRA rank | 16 |
| Training data | ~18k Chinese academic text pairs (CSL corpus) |
| Training steps | 900 steps (0.8 epoch) |
| Final loss | ~0.82 |
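
The step count and epoch fraction above imply a rough number of training examples consumed per optimizer step. The following is a back-of-the-envelope sketch only: it assumes the "~18k" figure is the full training set size, and the variable names are illustrative. The card itself does not state a batch size or gradient accumulation setting.

```python
# Rough arithmetic from the figures in the table above.
# Assumption: "~18k" is treated as exactly 18,000 training pairs.
num_pairs = 18_000      # approximate number of training pairs
steps = 900             # optimizer steps reported in the table
epoch_fraction = 0.8    # fraction of one epoch covered by those steps

# Examples consumed during training, and the implied examples per step.
examples_seen = num_pairs * epoch_fraction
examples_per_step = examples_seen / steps

print(f"examples seen: {examples_seen:.0f}")
print(f"implied examples per optimizer step: {examples_per_step:.1f}")
```

Because the dataset size is approximate, the result should be read as an order-of-magnitude estimate of the effective batch size, not a reported hyperparameter.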

Usage

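from unsloth import FastLanguageModel
from peft import PeftModel

# Load the base model without 4-bit quantization.
base_model, proc = FastLanguageModel.from_pretrained(
    "unsloth/Qwen3.5-9B", max_seq_length=2048, load_in_4bit=False,
)
# The loader may return a processor that wraps the tokenizer; unwrap it if so.
tokenizer = proc.tokenizer if hasattr(proc, "tokenizer") else proc

# Attach the LoRA adapter in inference-only mode.
model = PeftModel.from_pretrained(
    base_model, "XiangJinYu/Qwen3.5-9B-Humanize-SFT", is_trainable=False,
)
# Relabel the model type before switching Unsloth to inference mode.
if hasattr(model, "config") and getattr(model.config, "model_type", "") == "qwen3_5":
    model.config.model_type = "qwen3"
FastLanguageModel.for_inference(model)

# Instruction (Chinese): "Rewrite the text below so it reads more like natural
# human writing; keep the meaning and facts; do not add a title or explanation."
instruction = "请将下面文本改写得更像自然人写作,保持原意与事实,不要加标题或说明。"
# Sample input (Chinese): "This study aims to explore performance optimization
# strategies for deep learning models in natural language processing tasks."
text = "本研究旨在探讨深度学习模型在自然语言处理任务中的性能优化策略。"
# "原文" means "original text"; it labels the passage to be rewritten.
messages = [{"role": "user", "content": [{"type": "text", "text": f"{instruction}\n\n原文:{text}"}]}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.65,
                         top_p=0.9, do_sample=True, repetition_penalty=1.1)
# Drop the prompt tokens so that only the newly generated rewrite is decoded.
gen = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(gen, skip_special_tokens=True))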
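
When rewriting a longer document, it can help to split it into paragraphs and build one chat message per paragraph in the same format as the usage example. The sketch below is one possible way to do that; `build_messages` and `split_paragraphs` are hypothetical helper names, not part of this repository, and splitting on line breaks is an assumption about the input.

```python
# Hypothetical helpers (not part of the released code) that reproduce the
# prompt format from the usage example for each paragraph of a document.
INSTRUCTION = "请将下面文本改写得更像自然人写作,保持原意与事实,不要加标题或说明。"


def build_messages(text: str, instruction: str = INSTRUCTION) -> list:
    """Wrap one passage in the single-turn chat format used in the usage example."""
    return [{
        "role": "user",
        "content": [{"type": "text", "text": f"{instruction}\n\n原文:{text}"}],
    }]


def split_paragraphs(document: str) -> list:
    """Split on line breaks and drop empty lines (an assumed segmentation rule)."""
    return [line.strip() for line in document.split("\n") if line.strip()]


if __name__ == "__main__":
    doc = "第一段内容。\n\n第二段内容。"
    for paragraph in split_paragraphs(doc):
        messages = build_messages(paragraph)
        # Each `messages` list can be passed to tokenizer.apply_chat_template
        # exactly as in the usage example.
        print(messages[0]["content"][0]["text"])
```

Each paragraph is then generated independently, so `max_new_tokens` applies per paragraph instead of to the whole document.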

Training Details

  • Dataset: ~18k CSL academic paper pairs. Each pair holds a human-written abstract (labeled "chosen") and an AI-rewritten version of it (labeled "rejected")
  • Optimizer: AdamW 8-bit, lr=2e-4, cosine decay
  • Hardware: NVIDIA RTX 5090 (32GB), Unsloth + TRL
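
To make the learning-rate setting above concrete, here is a minimal sketch of a cosine decay schedule with a peak of 2e-4 over 900 steps. It assumes no warmup by default and decay to zero at the final step; the card does not state a warmup length or a minimum learning rate, so those are assumptions, and `cosine_lr` is an illustrative name and not a function from Unsloth or TRL.

```python
import math

PEAK_LR = 2e-4      # peak learning rate from the card
TOTAL_STEPS = 900   # training steps from the card


def cosine_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by cosine decay to zero.

    warmup_steps=0 is an assumption; the card does not mention warmup.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to the peak during warmup.
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(max(progress, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 225, 450, 675, 900):
        print(s, f"{cosine_lr(s):.2e}")
```

Under these assumptions the rate starts at the peak, reaches half the peak at the midpoint, and falls to zero at step 900.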

Limitations

  • Rewrites of everyday or casual text tend to be conservative, staying close to the input (improved in the DPO versions)
  • For more natural-sounding output, use one of the DPO versions

Model Series

| Model | Type | Recommended for |
| --- | --- | --- |
| This model | SFT | Foundation / training starting point |
| DPO Round 1 | DPO | General use, balanced |
| DPO Round 2 | DPO | Academic/technical, latest |