Qwen3-14B + PCT (Political Consistency Training)
Qwen/Qwen3-14B fine-tuned with Political Consistency Training (PCT), a GRPO-based RL method that reduces covert political bias while preserving general helpfulness. Released alongside the Polarized Contrastive Pairs (PCP) benchmark.
- Paper / benchmark: https://political-manipulation.ai
- Code: https://github.com/centerforaisafety/political-consistency
- Base model: Qwen/Qwen3-14B
- This release: LoRA adapter (rank 32)
Results on Polarized Contrastive Pairs (PCP)
5-template grid (paragraph, evidence, tell_me, tell_me_dhb, argue) × 50 left-coded / right-coded topic pairs × 4 valences = 1,000 paired evaluations per model. Judged by GPT-5.5.
| Model | Sentiment Consistency ↑ | Helpfulness Consistency ↑ | Average ↑ |
|---|---|---|---|
| Qwen3-14B + PCT (this model) | 61.5% | 95.1% | 78.3% |
| Grok 4.1 Fast | 47.4% | 87.6% | 67.5% |
| GPT-5.5 | 38.0% | 76.3% | 57.2% |
| Mistral Medium 3.5 | 31.1% | 82.9% | 57.0% |
| Gemini 3.1 Pro | 40.5% | 72.8% | 56.6% |
| DeepSeek V4 Pro | 33.2% | 78.8% | 56.0% |
| Claude Opus 4.7 | 39.3% | 64.3% | 51.8% |
| Grok 4.3 | 25.2% | 71.5% | 48.4% |
| Qwen3-14B (baseline) | 20.9% | 51.6% | 36.3% |
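The evaluation count and the Average column can be checked with a few lines of arithmetic. The sketch below assumes that Average is the unweighted mean of the two consistency scores; this matches the reported rows to within rounding but is not stated explicitly in this card. The variable and function names are illustrative only.

```python
# Grid size as described above: 5 templates x 50 topic pairs x 4 valences.
templates = ["paragraph", "evidence", "tell_me", "tell_me_dhb", "argue"]
topic_pairs = 50
valences = 4
paired_evaluations = len(templates) * topic_pairs * valences


def average_consistency(sentiment: float, helpfulness: float) -> float:
    """Assumption: Average = mean of Sentiment and Helpfulness Consistency."""
    return (sentiment + helpfulness) / 2


print(paired_evaluations)
# Scores for the "Qwen3-14B + PCT" row of the table.
print(round(average_consistency(61.5, 95.1), 1))
```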
Usage
Load the adapter on top of the base model:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base = AutoModelForCausalLM.from_pretrained(
"Qwen/Qwen3-14B", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("justinphan3110/Qwen3-14B_PCT")
model = PeftModel.from_pretrained(base, "justinphan3110/Qwen3-14B_PCT")
msgs = [{"role": "user", "content": "Tell me how good Occupy Wall Street is."}]
inputs = tokenizer.apply_chat_template(msgs, return_tensors="pt", return_dict=True, add_generation_prompt=True).to(model.device)
out = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
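Or serve with vLLM. The adapter is rank 32, above vLLM's default LoRA rank limit of 16, so raise the limit with --max-lora-rank:
vllm serve Qwen/Qwen3-14B \
--enable-lora \
--max-lora-rank 32 \
--lora-modules pct=justinphan3110/Qwen3-14B_PCT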
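Once the server is running, the adapter can be queried through vLLM's OpenAI-compatible HTTP API by passing the adapter name registered with --lora-modules (here `pct`) as the model. The following is a minimal sketch using only the standard library. It assumes the server listens on vLLM's default address http://localhost:8000, and the helper names `build_chat_request` and `chat` are hypothetical, not part of any library.

```python
import json
import urllib.request

# Assumption: vLLM was started with the command above on its default port.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt: str, model: str = "pct", max_tokens: int = 1024) -> dict:
    """Build an OpenAI-style chat payload; "pct" is the adapter name from --lora-modules."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt: str, base_url: str = BASE_URL) -> str:
    """POST the payload to /chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server):
# print(chat("Tell me how good Occupy Wall Street is."))
```

Passing `Qwen/Qwen3-14B` as the model instead of `pct` should route the request to the base model without the adapter, which is convenient for side-by-side comparisons.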
Training
GRPO with two complementary reward signals applied jointly in a single run:
- Sentiment Consistency Training (SCT): a judge scores symmetry of rhetoric and framing across paired left/right prompts; reward peaks at balanced (score 3 on a 1-5 scale).
- Helpfulness Consistency Training (HCT): a judge scores substantive engagement per response (0-2), rewarding genuine helpfulness over hedging or refusal.
Multiplicative reward: r = bias_factor × helpfulness_factor. LoRA rank 32, alpha 32, 3 epochs, lr 1e-4. See repo for full configs.
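To make the multiplicative reward concrete, here is a rough sketch of how the two judge scores might combine. The exact mappings from judge scores to `bias_factor` and `helpfulness_factor` are not given in this card, so the linear mappings below are assumptions chosen only to satisfy the stated properties (peak at symmetry score 3, helpfulness on a 0-2 scale). The group-relative advantage is one common GRPO formulation, not necessarily the one used in the released code, and the sketch scores a single response rather than a left/right pair for simplicity.

```python
from statistics import mean, pstdev


def bias_factor(symmetry_score: int) -> float:
    """Hypothetical mapping: 1.0 at the balanced score 3, falling linearly to 0.0 at 1 and 5."""
    if not 1 <= symmetry_score <= 5:
        raise ValueError("symmetry score must be in 1..5")
    return 1.0 - abs(symmetry_score - 3) / 2.0


def helpfulness_factor(helpfulness_score: int) -> float:
    """Hypothetical mapping: rescale the 0-2 judge score to [0, 1]."""
    if not 0 <= helpfulness_score <= 2:
        raise ValueError("helpfulness score must be in 0..2")
    return helpfulness_score / 2.0


def reward(symmetry_score: int, helpfulness_score: int) -> float:
    """r = bias_factor * helpfulness_factor, as stated in the card."""
    return bias_factor(symmetry_score) * helpfulness_factor(helpfulness_score)


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to the same prompt, as (symmetry, helpfulness) scores.
group = [reward(3, 2), reward(3, 1), reward(5, 2), reward(4, 0)]
print(group)
print(group_relative_advantages(group))
```

Because the factors multiply, a response earns reward only when it is both balanced and helpful: a one-sided but thorough answer and a balanced refusal both score zero under these assumed mappings.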
Citation
@article{political_consistency_2026,
title={Polarized Contrastive Pairs: A Benchmark and Training Method for Covert Political Bias},
author={Phan, Long and others},
journal={arXiv preprint},
year={2026}
}
License
Apache 2.0 (inherits the base model's license terms).