Qwen3-8B-PragReST

minchaoh2002/Qwen3-8B-PragReST is a Qwen3-8B model trained with PragReST: Pragmatic Reasoning via Self-Training.

PragReST is a self-supervised framework for improving pragmatic language understanding. It trains models to reason about implied meaning, speaker intent, implicature, presupposition, metonymy, social context, and other cases where the intended meaning is not fully explicit in the surface text.

This model is associated with the paper:

PragReST: Self-Reinforcing Counterfactual Reasoning for Pragmatic Language Understanding
Jihyung Park*, Minchao Huang*, Leqi Liu, Elias Stengel-Eskin
The University of Texas at Austin
arXiv: https://arxiv.org/abs/2606.18624
Code: https://github.com/jihyung803/PragReST

* Equal contribution.

Model Details

  • Model name: minchaoh2002/Qwen3-8B-PragReST
  • Base model: Qwen/Qwen3-8B
  • Model type: Causal language model
  • Training framework: PragReST
  • Training methods: supervised fine-tuning with counterfactual bootstrapping, followed by GRPO reinforcement learning
  • Primary capability: pragmatic language understanding and counterfactual pragmatic reasoning
  • Language: English
  • Paper: https://arxiv.org/abs/2606.18624
  • Code: https://github.com/jihyung803/PragReST
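
The GRPO stage listed under training methods relies on group-relative advantages: several responses are sampled for the same prompt, and each response's reward is normalized against the others in its group. The following is a minimal sketch of that normalization step only, in its commonly described form. It is not taken from the PragReST code; the function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration, and the reward values are made up.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses sampled for the same prompt.

    Hypothetical helper: subtracts the group mean and divides by the group
    (population) standard deviation, with eps guarding against division by zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative rewards for four sampled responses (not real data).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that score above their group's mean receive a positive advantage and those below receive a negative one, so no separate value model is needed. When every response in a group gets the same reward, all advantages are zero and the group contributes no learning signal.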

Quickstart

Install a recent version of transformers. Qwen3 models require transformers 4.51.0 or later; older versions fail with KeyError: 'qwen3'.

pip install -U transformers accelerate torch

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "minchaoh2002/Qwen3-8B-PragReST"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

prompt = """Ken asks Mary, "Do you want tea with milk or sugar?"
Mary replies, "In a cup."

What is Mary likely implying? Explain the pragmatic reasoning."""

messages = [
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    do_sample=True,
)

generated = outputs[0][len(inputs.input_ids[0]):]
print(tokenizer.decode(generated, skip_special_tokens=True))
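
With enable_thinking=True, the decoded text is expected to contain the model's reasoning followed by its final answer. The sketch below separates the two. It assumes this model follows the base Qwen3 convention of wrapping reasoning in `<think>` and `</think>`, and that those tags survive decoding; `split_thinking` is a hypothetical helper, not part of transformers or the PragReST code, and the sample string is made up.

```python
def split_thinking(decoded: str, close_tag: str = "</think>"):
    """Return (reasoning, answer) from a decoded generation.

    Assumes Qwen3-style output of the form "<think>...</think>answer".
    If no closing tag is present, the whole text is treated as the answer.
    """
    head, sep, tail = decoded.rpartition(close_tag)
    if not sep:
        return "", decoded.strip()
    # Drop the opening tag if the model emitted one.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


# Illustrative string standing in for tokenizer.decode(...) output.
sample = "<think>\nMary dodges the question.\n</think>\n\nShe is being playful."
reasoning, answer = split_thinking(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

In the Quickstart, the same helper could be applied to the string returned by `tokenizer.decode(generated, skip_special_tokens=True)`. If generation stops at `max_new_tokens` before the closing tag appears, the helper returns an empty reasoning string and the raw text as the answer, so that case should be checked.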

Evaluation

In the paper, PragReST is evaluated on four pragmatic reasoning benchmarks:

  • PragMega: fine-grained pragmatic QA
  • Ludwig: implicature interpretation
  • MetoQA: metonymic reference resolution
  • AltPrag: open-ended pragmatic recovery

Reported Qwen3-8B results:

| Model | PragMega | Ludwig | MetoQA | AltPrag |
|---|---|---|---|---|
| Qwen3-8B Instruct | 73.37 | 80.33 | 73.52 | 7.24 |
| PragReST-SFT | 77.51 | 82.17 | 78.56 | 7.46 |
| PragReST-GRPO | 79.29 | 83.33 | 80.72 | 7.62 |
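
To make the size of the improvements explicit, the short sketch below computes each PragReST variant's gain over the Qwen3-8B Instruct baseline from the reported numbers. The scores are copied from the results above; the function and variable names are illustrative only. AltPrag is reported on a different scale from the other three benchmarks, so its gains should not be compared directly with theirs.

```python
benchmarks = ["PragMega", "Ludwig", "MetoQA", "AltPrag"]

# Scores as reported in the results above.
scores = {
    "Qwen3-8B Instruct": [73.37, 80.33, 73.52, 7.24],
    "PragReST-SFT": [77.51, 82.17, 78.56, 7.46],
    "PragReST-GRPO": [79.29, 83.33, 80.72, 7.62],
}


def gains_over_baseline(scores, baseline="Qwen3-8B Instruct"):
    """Return per-benchmark differences between each model and the baseline."""
    base = scores[baseline]
    return {
        name: [round(value - ref, 2) for value, ref in zip(values, base)]
        for name, values in scores.items()
        if name != baseline
    }


gains = gains_over_baseline(scores)
for name, deltas in gains.items():
    print(name, dict(zip(benchmarks, deltas)))
```

On these numbers, both stages improve on the baseline across all four benchmarks, and the GRPO stage adds a further gain over SFT on each of them.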

Citation

If you use this model, please cite:

@article{park2026pragrest,
  title={PragReST: Self-Reinforcing Counterfactual Reasoning for Pragmatic Language Understanding},
  author={Park, Jihyung and Huang, Minchao and Liu, Leqi and Stengel-Eskin, Elias},
  year={2026},
  journal={arXiv preprint arXiv:2606.18624},
  url={https://arxiv.org/abs/2606.18624},
}

You may also cite the Qwen3 technical report for the base model:

@misc{qwen3technicalreport,
  title={Qwen3 Technical Report},
  author={Qwen Team},
  year={2025},
  eprint={2505.09388},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.09388},
}

License

This model is released under the Apache 2.0 license, following the base Qwen/Qwen3-8B model license.

Please also refer to the licenses and terms of the original datasets, benchmarks, and artifacts used in the PragReST paper.
