Qwen3-14B-PragReST
minchaoh2002/Qwen3-14B-PragReST is a Qwen3-14B model trained with PragReST: Pragmatic Reasoning via Self-Training.
PragReST is a self-supervised framework for improving pragmatic language understanding. It trains models to reason about implied meaning, speaker intent, implicature, presupposition, metonymy, social context, and other cases where the intended meaning is not fully explicit in the surface text.
This model is associated with the paper:
PragReST: Self-Reinforcing Counterfactual Reasoning for Pragmatic Language Understanding
Jihyung Park*, Minchao Huang*, Leqi Liu, Elias Stengel-Eskin
The University of Texas at Austin
arXiv: https://arxiv.org/abs/2606.18624
Code: https://github.com/jihyung803/PragReST
* Equal contribution.
Model Details
- Model name: minchaoh2002/Qwen3-14B-PragReST
- Base model: Qwen/Qwen3-14B
- Model type: Causal language model
- Training framework: PragReST
- Training methods: supervised fine-tuning with counterfactual bootstrapping, followed by GRPO reinforcement learning
- Primary capability: pragmatic language understanding and counterfactual pragmatic reasoning
- Language: English
- Paper: https://arxiv.org/abs/2606.18624
- Code: https://github.com/jihyung803/PragReST
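For readers unfamiliar with GRPO, the snippet below sketches the group-relative advantage step that gives the method its name: several answers are sampled for one prompt, and each answer's reward is normalized against the others in that group. This is a generic illustration, not the PragReST training code. The function name, the reward values, and the use of the population standard deviation are assumptions made for the example, and implementations differ on these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the other samples drawn for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, some libraries use sample std
    # eps keeps the division defined when every sample receives the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt (1.0 = judged correct)
rewards = [1.0, 0.0, 0.0, 1.0]
print([round(a, 3) for a in group_relative_advantages(rewards)])
```

Answers that score above the group mean receive a positive advantage and are reinforced, while those below the mean are discouraged. A group in which all answers score the same yields zero advantage for every sample.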
Quickstart
Install a recent version of transformers. Qwen3 support was added in transformers 4.51.0, and older versions fail to load the model.
pip install -U transformers accelerate torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "minchaoh2002/Qwen3-14B-PragReST"

# Load the tokenizer and model; dtype and device placement are chosen automatically.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

prompt = """Ken asks Mary, "Do you want tea with milk or sugar?"
Mary replies, "In a cup."
What is Mary likely implying? Explain the pragmatic reasoning."""
messages = [
    {"role": "user", "content": prompt}
]

# Build the chat-formatted prompt with Qwen3 thinking mode enabled.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Sample a response.
outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    do_sample=True,
)

# Drop the prompt tokens and decode only the newly generated ones.
generated = outputs[0][len(inputs.input_ids[0]):]
print(tokenizer.decode(generated, skip_special_tokens=True))
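With `enable_thinking=True`, the decoded text usually contains the model's reasoning followed by its final answer. The helper below is a minimal sketch for separating the two. It assumes the decoded string keeps the `<think>` ... `</think>` markers, as the base Qwen3 models do in thinking mode. `split_thinking` is a hypothetical name, not part of transformers or the PragReST code.

```python
def split_thinking(decoded: str, close_tag: str = "</think>"):
    """Return (reasoning, answer); reasoning is empty if no closing tag is found."""
    head, sep, tail = decoded.rpartition(close_tag)
    if not sep:
        # No closing tag: treat the whole output as the answer
        return "", decoded.strip()
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


# Hypothetical decoded output, for illustration only
sample = "<think>\nMary dodges the question.\n</think>\n\nShe wants neither."
reasoning, answer = split_thinking(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

If generation stops at `max_new_tokens` before the closing tag appears, the helper returns the whole output as the answer, so callers may want to check for an empty reasoning string.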
Evaluation
In the paper, PragReST is evaluated on four pragmatic reasoning benchmarks:
- PragMega: fine-grained pragmatic QA
- Ludwig: implicature interpretation
- MetoQA: metonymic reference resolution
- AltPrag: open-ended pragmatic recovery
Reported Qwen3-14B results:
| Model | PragMega | Ludwig | MetoQA | AltPrag |
|---|---|---|---|---|
| Qwen3-14B Instruct | 81.66 | 82.67 | 71.77 | 7.78 |
| PragReST-SFT | 85.21 | 85.17 | 79.65 | 8.06 |
| PragReST-GRPO | 85.80 | 86.50 | 80.31 | 8.14 |
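To read the table as gains over the base model, the short script below subtracts the Qwen3-14B Instruct row from each PragReST row. The numbers are copied from the table above. The variable and function names are illustrative only. AltPrag scores appear to be on a different scale from the other three columns, so differences should be compared within a column rather than across columns.

```python
benchmarks = ["PragMega", "Ludwig", "MetoQA", "AltPrag"]
scores = {
    "Qwen3-14B Instruct": [81.66, 82.67, 71.77, 7.78],
    "PragReST-SFT": [85.21, 85.17, 79.65, 8.06],
    "PragReST-GRPO": [85.80, 86.50, 80.31, 8.14],
}


def gains_over(baseline, model):
    """Per-benchmark difference between a model row and the baseline row."""
    return {
        name: round(new - old, 2)
        for name, old, new in zip(benchmarks, scores[baseline], scores[model])
    }


for model in ("PragReST-SFT", "PragReST-GRPO"):
    print(model, gains_over("Qwen3-14B Instruct", model))
```

The largest gain for both variants is on MetoQA, and the GRPO stage adds a further, smaller improvement over SFT on every benchmark.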
Citation
If you use this model, please cite:
@article{park2026pragrest,
title={PragReST: Self-Reinforcing Counterfactual Reasoning for Pragmatic Language Understanding},
author={Park, Jihyung and Huang, Minchao and Liu, Leqi and Stengel-Eskin, Elias},
year={2026},
journal={arXiv preprint arXiv:2606.18624},
url={https://arxiv.org/abs/2606.18624},
}
You may also cite the Qwen3 technical report for the base model:
@misc{qwen3technicalreport,
title={Qwen3 Technical Report},
author={Qwen Team},
year={2025},
eprint={2505.09388},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2505.09388},
}
License
This model is released under the Apache 2.0 license, following the base Qwen/Qwen3-14B model license.
Please also refer to the licenses and terms of the original datasets, benchmarks, and artifacts used in the PragReST paper.