ChineseErrorCorrector4-4B (CSRP)

GitHub   Hugging Face   ACL 2026 Oral   License


🔥 Recent Updates

| Date | Update |
| --- | --- |
| 2026-05 | 🎉 Paper "CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards" accepted as Oral at ACL 2026 |
| 2026-05 | 🚀 Released ChineseErrorCorrector4-4B, achieving new SOTA on both NACGEC and CSCD benchmarks |

💡 Introduction

ChineseErrorCorrector4-4B is a high-precision Chinese Grammatical Error Correction (CGEC) and Chinese Spelling Check (CSC) model, built on the CSRP (CPT → SFT → RL) three-stage training framework.

The Problem: Over-Correction Bias

Traditional LLM-based correction systems often suffer from over-correction bias: the model paraphrases text that is already correct instead of leaving it untouched. CSRP addresses this by calibrating decision boundaries through a three-stage curriculum:

| Stage | Name | Description |
| --- | --- | --- |
| Phase I | Balanced Continued Pre-training (CPT) | Internalizes linguistic priors using 5.9M samples with an 8:2 mixture of general and correction-specific data |
| Phase II | Rationale-Augmented SFT | Distills Chain-of-Thought reasoning paths to guide the model in diagnosing error types before executing corrections |
| Phase III | Efficiency-Aware Policy Alignment | Uses GRPO with a novel Efficiency-Aware Reward (EAR) to penalize unnecessary edits and reward surgical precision |
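
The exact form of the Efficiency-Aware Reward is not given in this README, so the following is only a hypothetical sketch of the idea behind Phase III: score each sampled correction, subtract a penalty for edits beyond what the reference needs, then standardize the rewards within the sampled group as GRPO does. The names `edit_count`, `toy_reward`, and `group_advantages`, the penalty weight `lam`, and the exact-match correctness term are all assumptions made for illustration, not the authors' implementation.

```python
import difflib
import statistics


def edit_count(source: str, target: str) -> int:
    """Rough character-level edit size between two strings (illustrative)."""
    matcher = difflib.SequenceMatcher(None, source, target, autojunk=False)
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


def toy_reward(source: str, hypothesis: str, reference: str, lam: float = 0.1) -> float:
    """Hypothetical reward: exact-match credit minus a penalty for excess edits."""
    correct = 1.0 if hypothesis == reference else 0.0
    excess = max(0, edit_count(source, hypothesis) - edit_count(source, reference))
    return correct - lam * excess


def group_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO-style advantage: standardize rewards within one sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


source = "我跟我朋唷去法国。"
reference = "我跟我朋友去法国。"
candidates = [
    "我跟我朋友去法国。",      # minimal, correct fix
    "我跟我朋唷去法国。",      # left unchanged (error missed)
    "我和我的朋友要去法国。",  # fixes the typo but paraphrases the rest
]
rewards = [toy_reward(source, c, reference) for c in candidates]
advantages = group_advantages(rewards)
```

In this toy setup the minimal fix receives the highest advantage, while the paraphrasing candidate scores below the unchanged one because every edit beyond the reference costs `lam`. A real reward would likely give partial credit for partly correct outputs; that choice is left open here.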

📊 Benchmark Results

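Leaderboard 1: Chinese Grammatical Error Correction (CGEC), NACGEC benchmark

Across native-speaker and learner Chinese text, CSRP (4B) sets a new SOTA with an $F_{0.5}$ of 50.99, 2.25 points above the strongest earlier specialized model in the table (CEC3, 48.74).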

| Model (Scale) | Precision | Recall | $F_{0.5}$ (primary metric) |
| --- | --- | --- | --- |
| BART | 34.67 | 41.88 | 35.91 |
| HW-CGEC | 50.95 | 32.29 | 45.26 |
| ScholarGEC (14B) | 45.08 | 59.33 | 47.35 |
| CEC3 (4B) | 54.20 | 34.75 | 48.74 |
| CSRP (4B) [Ours] | 57.17 | 35.60 | 50.99 |

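🔥 Outperforms a 14B model: with roughly 30% of the parameters, CSRP improves $F_{0.5}$ by 3.64 over ScholarGEC-14B (47.35 → 50.99).

🔥 Highest precision (57.17): the best in the table, 2.97 points above CEC3 (54.20). This keeps false-positive rewrites to a minimum, in line with the goal of leaving correct text unchanged and fixing real errors precisely.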
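
To make the $F_{0.5}$ column concrete, here is a minimal sketch that recomputes it from precision and recall, assuming the standard F-beta definition $F_\beta = (1+\beta^2)\,P\,R / (\beta^2 P + R)$ with $\beta = 0.5$. The helper name `f_beta` is illustrative. Published scores come from the benchmark's own scorer, so recomputing from rounded table values need not match every row exactly.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (precision, recall) pairs taken from the NACGEC table
rows = {
    "ScholarGEC (14B)": (45.08, 59.33),
    "CEC3 (4B)": (54.20, 34.75),
    "CSRP (4B)": (57.17, 35.60),
}
for name, (p, r) in rows.items():
    print(f"{name}: F0.5 = {f_beta(p, r):.2f}")
```

For these three rows the recomputed values agree with the table (47.35, 48.74, 50.99). ScholarGEC has by far the highest recall but a lower $F_{0.5}$, which shows how strongly the metric favors precision.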


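Leaderboard 2: Chinese Spelling Check (CSC), CSCD benchmark

CSRP also leads on character-level correction F1, reaching 59.61, which is 5.20 points above few-shot GPT-4 (54.41).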

| Model | Correction F1 |
| --- | --- |
| BERT | 25.49 |
| SoftMask | 44.48 |
| SMBERT | 44.67 |
| MDCSpell+ARM | 48.93 |
| GPT-4 (Few-shot) | 54.41 |
| CSRP (4B) [Ours] | 59.61 |

🛠️ Quick Start

Requirements

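pip install -U "transformers>=4.51.0" torch accelerate

Note: transformers >= 4.51.0 is required for Qwen3 architecture support; accelerate is needed because the example below loads the model with device_map="auto".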

Inference with Transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "twnlp/ChineseErrorCorrector4-4B"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Professional instruction template
instruction = (
    "假如你是一名专业的纠错专家,请分析输入句子的语法错误类型和修改原因,"
    "并只输出纠正后的语句,错误类型如下:错别字、词语搭配错误、词性错误、"
    "语序错误、成分残缺、成分赘余、关联词使用错误、指代不明、语义逻辑不通、无误。"
)

text_input = "下个星期,我跟我朋唷打算去法国玩儿。"

messages = [
    {"role": "system", "content": instruction},
    {"role": "user",   "content": text_input}
]

text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.1
)

response = tokenizer.decode(
    generated_ids[0][len(model_inputs.input_ids[0]):],
    skip_special_tokens=True
)
print(response)

📝 Output Example

Input:

下个星期,我跟我朋唷打算去法国玩儿。

Model Output:

<think>
错误类型:错别字
修改原因:原句中的"朋唷"是错误写法,正确应为"朋友"。
"唷"是语气助词,不能用于此处指代同伴。
正确句使用"朋友"准确表达了与说话者一同前往的人,避免了因错别字造成的语义误解。
</think>

下个星期,我跟我朋友打算去法国玩儿。

Supported error types:

| Error type | Description |
| --- | --- |
| 错别字 | Typos / wrong characters |
| 词语搭配错误 | Wrong word collocation |
| 词性错误 | Wrong part of speech |
| 语序错误 | Wrong word order |
| 成分残缺 | Missing sentence components |
| 成分赘余 | Redundant components |
| 关联词使用错误 | Wrong conjunction usage |
| 指代不明 | Ambiguous reference |
| 语义逻辑不通 | Semantic/logical inconsistency |
| 无误 | No error |


📜 License

This project is released under the Apache 2.0 License.

