Qwen3-14B + PCT (Political Consistency Training)

Qwen/Qwen3-14B fine-tuned with Political Consistency Training (PCT), a GRPO-based RL method that reduces covert political bias while preserving general helpfulness. Released alongside the Polarized Contrastive Pairs (PCP) benchmark.

Results on Polarized Contrastive Pairs (PCP)

Each model is evaluated on a 5-template grid (paragraph, evidence, tell_me, tell_me_dhb, argue) over 50 left-coded/right-coded topic pairs and 4 valences: 5 × 50 × 4 = 1,000 paired evaluations per model. Responses are judged by GPT-5.5.

| Model | Sentiment Consistency ↑ | Helpfulness Consistency ↑ | Average ↑ |
|---|---|---|---|
| Qwen3-14B + PCT (this model) | 61.5% | 95.1% | 78.3% |
| Grok 4.1 Fast | 47.4% | 87.6% | 67.5% |
| GPT-5.5 | 38.0% | 76.3% | 57.2% |
| Mistral Medium 3.5 | 31.1% | 82.9% | 57.0% |
| Gemini 3.1 Pro | 40.5% | 72.8% | 56.6% |
| DeepSeek V4 Pro | 33.2% | 78.8% | 56.0% |
| Claude Opus 4.7 | 39.3% | 64.3% | 51.8% |
| Grok 4.3 | 25.2% | 71.5% | 48.4% |
| Qwen3-14B (baseline) | 20.9% | 51.6% | 36.3% |
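
As a quick sanity check on the numbers above, the sketch below recomputes the grid size and the Average column. It assumes Average is the unweighted mean of the two consistency scores; the card does not state this, but every row agrees with it to within rounding. The helper names (grid_size, average_score, max_deviation) are illustrative and not part of the benchmark code.

```python
# Grid dimensions as described in the evaluation setup.
TEMPLATES = ["paragraph", "evidence", "tell_me", "tell_me_dhb", "argue"]
NUM_TOPIC_PAIRS = 50
NUM_VALENCES = 4


def grid_size(templates=TEMPLATES, pairs=NUM_TOPIC_PAIRS, valences=NUM_VALENCES):
    """Number of paired evaluations per model: templates x pairs x valences."""
    return len(templates) * pairs * valences


def average_score(sentiment, helpfulness):
    """ASSUMPTION: Average is the unweighted mean of the two scores."""
    return (sentiment + helpfulness) / 2


# (sentiment, helpfulness, reported average), copied from the table.
REPORTED = {
    "Qwen3-14B + PCT": (61.5, 95.1, 78.3),
    "Grok 4.1 Fast": (47.4, 87.6, 67.5),
    "GPT-5.5": (38.0, 76.3, 57.2),
    "Mistral Medium 3.5": (31.1, 82.9, 57.0),
    "Gemini 3.1 Pro": (40.5, 72.8, 56.6),
    "DeepSeek V4 Pro": (33.2, 78.8, 56.0),
    "Claude Opus 4.7": (39.3, 64.3, 51.8),
    "Grok 4.3": (25.2, 71.5, 48.4),
    "Qwen3-14B (baseline)": (20.9, 51.6, 36.3),
}


def max_deviation(rows=REPORTED):
    """Largest gap between the recomputed mean and the reported average."""
    return max(abs(average_score(s, h) - avg) for s, h, avg in rows.values())
```

Because the table's inputs are themselves rounded to one decimal, the recomputed mean can differ from the reported average by up to about 0.05 points.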

Usage

Load the adapter on top of the base model:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the PCT LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-14B", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("justinphan3110/Qwen3-14B_PCT")
model = PeftModel.from_pretrained(base, "justinphan3110/Qwen3-14B_PCT")

msgs = [{"role": "user", "content": "Tell me how good Occupy Wall Street is."}]
# return_dict=True also returns the attention mask, which generate() expects.
inputs = tokenizer.apply_chat_template(
    msgs, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
out = model.generate(**inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))

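Or serve with vLLM. The adapter's LoRA rank (32) is above vLLM's default maximum of 16, so raise the limit explicitly:

vllm serve Qwen/Qwen3-14B \
  --enable-lora \
  --max-lora-rank 32 \
  --lora-modules pct=justinphan3110/Qwen3-14B_PCT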
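
Once the server is running, the adapter can be queried through vLLM's OpenAI-compatible chat endpoint by passing the LoRA module name (pct) as the model. The following is a minimal standard-library sketch; it assumes the server listens on vLLM's default localhost:8000, and build_chat_request and ask are illustrative helper names, not part of vLLM.

```python
import json
import urllib.request

# ASSUMPTION: vLLM's default host and port; adjust if you passed --host/--port.
BASE_URL = "http://localhost:8000/v1"
# The name registered on the command line via --lora-modules pct=...
ADAPTER_NAME = "pct"


def build_chat_request(prompt, max_tokens=1024):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": ADAPTER_NAME,  # selects the LoRA adapter, not the base model
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the prompt to the running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server up, ask("Tell me how good Occupy Wall Street is.") returns the adapter's reply as a string. Setting the model field to Qwen/Qwen3-14B instead would query the base model without the adapter.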

Training

GRPO with two complementary reward signals applied jointly in a single run:

  • Sentiment Consistency Training (SCT): a judge scores the symmetry of rhetoric and framing across paired left/right prompts on a 1-5 scale; the reward peaks at the balanced score of 3.
  • Helpfulness Consistency Training (HCT): a judge scores the substantive engagement of each response on a 0-2 scale, rewarding genuine helpfulness over hedging or refusal.

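The two signals are combined multiplicatively: r = bias_factor × helpfulness_factor, so a response earns a high reward only when it is both balanced and helpful. Hyperparameters: LoRA rank 32, alpha 32, 3 epochs, learning rate 1e-4. See the repo for full configs.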
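
To make the multiplicative reward concrete, here is a minimal sketch. The card gives only the judge scales and the product form, so the two mappings below are assumptions chosen for illustration: a linear bias_factor that is 1.0 at the balanced score 3 and 0.0 at 1 or 5, and a helpfulness_factor that rescales 0-2 to [0, 1]. How the pair-level sentiment score is attached to individual responses is also not specified here. The group_advantages helper shows the usual GRPO-style normalization within a group of sampled responses; see the repo for the actual configuration.

```python
from statistics import mean, pstdev


def bias_factor(sentiment_score):
    """ASSUMED mapping: 1.0 at the balanced score 3, linear down to 0.0 at 1 or 5."""
    if not 1 <= sentiment_score <= 5:
        raise ValueError("sentiment score must be in [1, 5]")
    return 1.0 - abs(sentiment_score - 3) / 2.0


def helpfulness_factor(helpfulness_score):
    """ASSUMED mapping: rescale the 0-2 judge score to [0, 1]."""
    if not 0 <= helpfulness_score <= 2:
        raise ValueError("helpfulness score must be in [0, 2]")
    return helpfulness_score / 2.0


def reward(sentiment_score, helpfulness_score):
    """Multiplicative reward: r = bias_factor * helpfulness_factor."""
    return bias_factor(sentiment_score) * helpfulness_factor(helpfulness_score)


def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One hypothetical group of four sampled responses: (sentiment, helpfulness).
group = [(3, 2), (5, 2), (3, 0), (4, 1)]
rewards = [reward(s, h) for s, h in group]
print(rewards)
print(group_advantages(rewards))
```

The product form means either factor can veto the reward: a balanced refusal (3, 0) and a helpful but one-sided answer (5, 2) both score zero under these assumed mappings, which is the behavior a sum of the two signals would not give.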

Citation

@article{political_consistency_2026,
  title={Polarized Contrastive Pairs: A Benchmark and Training Method for Covert Political Bias},
  author={Phan, Long and others},
  journal={arXiv preprint},
  year={2026}
}

License

Apache 2.0 (inherits the base model's license terms).
