ChineseErrorCorrector4-4B (CSRP)

🔥 Recent Updates

Date	Update
2026-05	🎉 Paper "CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards" accepted as Oral at ACL 2026
2026-05	🚀 Released ChineseErrorCorrector4-4B, achieving new SOTA on both NACGEC and CSCD benchmarks

💡 Introduction

ChineseErrorCorrector4-4B is a high-precision Chinese Grammatical Error Correction (CGEC) and Chinese Spelling Check (CSC) model, built on the CSRP (CPT → SFT → RL) three-stage training framework.

The Problem: Over-Correction Bias

LLM-based correction systems often suffer from over-correction bias: the model paraphrases text that is already correct instead of leaving it untouched. CSRP addresses this by calibrating the model's edit/no-edit decision boundary through a three-stage curriculum:

Stage	Name	Description
Phase I	Balanced Continued Pre-training (CPT)	Internalizes linguistic priors using 5.9M samples with an 8:2 mixture of general and correction-specific data
Phase II	Rationale-Augmented SFT	Distills Chain-of-Thought reasoning paths to guide the model in diagnosing error types before executing corrections
Phase III	Efficiency-Aware Policy Alignment	Uses GRPO with a novel Efficiency-Aware Reward (EAR) to penalize unnecessary edits and reward surgical precision
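
This README does not give the actual form of the Efficiency-Aware Reward. Purely to illustrate the idea of penalizing unnecessary edits, here is a toy sketch that assumes exact-match correctness and a linear per-character penalty. The names `edit_count`, `toy_reward`, and `penalty` are hypothetical and are not taken from the paper or the released code.

```python
from difflib import SequenceMatcher


def edit_count(a: str, b: str) -> int:
    """Count characters touched by the non-matching spans between a and b."""
    ops = SequenceMatcher(None, a, b, autojunk=False).get_opcodes()
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in ops
        if tag != "equal"
    )


def toy_reward(source: str, prediction: str, reference: str,
               penalty: float = 0.1) -> float:
    """Illustrative reward only: 1.0 for an exact match with the reference,
    minus a penalty for every edited character beyond what the reference needs."""
    correct = 1.0 if prediction == reference else 0.0
    extra = max(0, edit_count(source, prediction) - edit_count(source, reference))
    return correct - penalty * extra


source = "下个星期，我跟我朋唷打算去法国玩儿。"
reference = "下个星期，我跟我朋友打算去法国玩儿。"
over_edited = "下周，我和我的朋友计划去法国旅游。"  # fluent, but rewrites far too much

print(toy_reward(source, reference, reference))    # minimal, correct edit
print(toy_reward(source, source, reference))       # error left unfixed
print(toy_reward(source, over_edited, reference))  # over-correction
```

Under these assumptions the minimal correction scores highest, the untouched sentence scores zero, and the paraphrase scores below zero, which is the ordering an efficiency-aware reward is meant to encourage.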

📊 Benchmark Results

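Benchmark 1: Chinese Grammatical Error Correction (CGEC) on NACGEC

On native-speaker and learner text, CSRP (4B) sets a new SOTA with an $F_{0.5}$ of 50.99, 2.25 points above the strongest prior specialized model in the table (CEC3, 48.74).

Model (Scale)	Precision	Recall	$F_{0.5}$ (primary metric)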
BART	34.67	41.88	35.91
HW-CGEC	50.95	32.29	45.26
ScholarGEC (14B)	45.08	59.33	47.35
CEC3 (4B)	54.20	34.75	48.74
CSRP (4B) [Ours] ✅	57.17	35.60	50.99

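🔥 Outperforms a 14B model: with roughly 30% of the parameters, CSRP improves $F_{0.5}$ by 3.64 points over ScholarGEC (14B).

🔥 Highest precision (57.17): 2.97 points above the next-best model (CEC3, 54.20), which means fewer false-positive rewrites. The goal is no edits where there is no error, and precise edits where there is one.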
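
$F_{0.5}$ weights precision more heavily than recall, which is why it suits a task where unnecessary edits are costly. As a quick sanity check, the sketch below applies the standard F-beta formula to the precision and recall values in the table. The helper name `f_beta` is hypothetical, and official scorers may aggregate counts at corpus level, so a recomputed value can differ slightly from a published one for some rows.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favors precision over recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall taken from the NACGEC table.
rows = {
    "ScholarGEC (14B)": (45.08, 59.33),
    "CSRP (4B)": (57.17, 35.60),
}
for name, (p, r) in rows.items():
    print(f"{name}: F0.5 = {f_beta(p, r):.2f}")
```

ScholarGEC has much higher recall, yet CSRP ends up ahead on $F_{0.5}$ because the metric rewards its higher precision.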

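Benchmark 2: Chinese Spelling Check (CSC) on CSCD

CSRP is also the strongest model on character-level correction F1, scoring 59.61, which is 5.20 points above few-shot GPT-4 (54.41).

Model	Correction F1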
BERT	25.49
SoftMask	44.48
SMBERT	44.67
MDCSpell+ARM	48.93
GPT-4 (Few-shot)	54.41
CSRP (4B) [Ours] ✅	59.61

🛠️ Quick Start

Requirements

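pip install -U "transformers>=4.51.0" torch accelerate

Note: Requires transformers >= 4.51.0 for Qwen3 architecture support. accelerate is needed because the inference example loads the model with device_map="auto".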

Inference with Transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "twnlp/ChineseErrorCorrector4-4B"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

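# System prompt, kept in Chinese exactly as provided. In English: "You are a
# professional proofreader. Analyze the error type and the reason for the change,
# and output only the corrected sentence", followed by the list of error types.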
instruction = (
    "假如你是一名专业的纠错专家，请分析输入句子的语法错误类型和修改原因，"
    "并只输出纠正后的语句，错误类型如下：错别字、词语搭配错误、词性错误、"
    "语序错误、成分残缺、成分赘余、关联词使用错误、指代不明、语义逻辑不通、无误。"
)

text_input = "下个星期，我跟我朋唷打算去法国玩儿。"

messages = [
    {"role": "system", "content": instruction},
    {"role": "user",   "content": text_input}
]

text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.1
)

# Slice off the prompt tokens so that only the newly generated text is decoded.
response = tokenizer.decode(
    generated_ids[0][len(model_inputs.input_ids[0]):],
    skip_special_tokens=True
)
print(response)

📝 Output Example

Input:

下个星期，我跟我朋唷打算去法国玩儿。

Model Output:

<think>
错误类型：错别字
修改原因：原句中的"朋唷"是错误写法，正确应为"朋友"。
"唷"是语气助词，不能用于此处指代同伴。
正确句使用"朋友"准确表达了与说话者一同前往的人，避免了因错别字造成的语义误解。
</think>

下个星期，我跟我朋友打算去法国玩儿。
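
Downstream code usually needs only the corrected sentence, not the reasoning. A minimal sketch for separating the two and listing the character-level changes, assuming the decoded response keeps the `<think>` and `</think>` tags as in the example above. The helpers `split_reasoning` and `list_edits` are hypothetical names, not part of the released code.

```python
from difflib import SequenceMatcher


def split_reasoning(response: str) -> tuple[str, str]:
    """Split a decoded response into (reasoning, corrected_sentence)."""
    head, sep, tail = response.partition("</think>")
    if not sep:  # no reasoning block: treat the whole response as the correction
        return "", response.strip()
    return head.replace("<think>", "", 1).strip(), tail.strip()


def list_edits(source: str, corrected: str) -> list[tuple[int, str, str]]:
    """Return (position in source, original text, replacement) for each edit."""
    matcher = SequenceMatcher(None, source, corrected, autojunk=False)
    return [
        (i1, source[i1:i2], corrected[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


source = "下个星期，我跟我朋唷打算去法国玩儿。"
# Shortened stand-in for the model response shown above.
response = (
    "<think>\n错误类型：错别字\n</think>\n\n"
    "下个星期，我跟我朋友打算去法国玩儿。"
)

reasoning, corrected = split_reasoning(response)
print(corrected)
print(list_edits(source, corrected))
```

An empty edit list means the model left the sentence unchanged, which corresponds to the 无误 (no error) case.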

Supported error types:

Error type	Description
错别字	Typos / wrong characters
词语搭配错误	Wrong word collocation
词性错误	Wrong part of speech
语序错误	Wrong word order
成分残缺	Missing sentence components
成分赘余	Redundant components
关联词使用错误	Wrong conjunction usage
指代不明	Ambiguous reference
语义逻辑不通	Semantic/logical inconsistency
无误	No error