MiniCPM4.1 Chinese Context Corrector

This is a full merged fine-tuned model based on openbmb/MiniCPM4.1-8B.

This model was created for the Hugging Face BuildSmall hackathon as the language-correction model powering the ToneBridge Space.

It was fine-tuned for Chinese context-aware sentence correction: given a short context and an imperfect Chinese sentence, the model returns a corrected sentence that better matches the context, tone, and intended meaning.

This repository contains merged model weights, not only a LoRA adapter.

Project Context

This model is part of ToneBridge, a Hugging Face Space built for the BuildSmall hackathon. ToneBridge focuses on rewriting short Chinese messages so they better match the intended tone, social context, and communication situation.

Intended Use

  • Correct short Chinese sentences according to context.
  • Adjust tone for common situations such as work messages, family messages, customer support, WeChat-style chat, school/course communication, community notices, and daily-life service messages.
  • Produce only the corrected Chinese sentence, without explanation.

Prompt Format

The model was trained with chat-style prompts similar to:

System: 你是中文语境校对助手。只输出修正后的句子,不要解释。
User:
上下文:场景:微信聊天;语气:自然友好。朋友约周末见面,语气轻松。
原句:周六上午可以见面,你来。
任务:请根据上下文修正原句。/no_think
Assistant:
周六上午可以见面,你可以过来。

Training Data

The fine-tuning dataset contains synthetic and manually reviewed Chinese correction examples. The dataset was built to reduce overly similar examples and to cover distinct contexts and tones, including polite, formal, friendly, neutral, and service-oriented messages.

Limitations

  • The model is specialized for short Chinese sentence correction, not general open-ended chat.
  • It may still over-correct, under-correct, or choose a tone that differs from the user's intended nuance.
  • For sensitive, legal, medical, or business-critical text, review the output manually.

Loading Example

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Alphaplasti/ToneBridge-MiniCPM4.1-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
)

Base Model

This model is derived from openbmb/MiniCPM4.1-8B. Please also review and respect the base model's license and usage conditions.

Downloads last month
-
Safetensors
Model size
8B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Alphaplasti/ToneBridge-MiniCPM4.1-8B

Finetuned
(4)
this model