eXTC — ICLR Reviews (paper acceptance prediction)
Anonymized artifact for a paper under double-blind review. Author identity and institution will be revealed at camera-ready.
This is the final-stage checkpoint of eXTC (eXplainable Text Classifier) for binary paper-acceptance prediction from ICLR reviewer comments. The data is drawn from the Re² peer-review corpus (ICLR main-conference rows).
- Input: the concatenated reviewer comments for a paper.
- Label: accept (1) or reject (0), i.e., whether the reviewer comments collectively imply the paper should be accepted.
- Output: a free-text reasoning trace followed by a final LABEL: <accept|reject> line; the reasoning serves as a local, inspectable explanation of the prediction.
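Because the prediction arrives as text, downstream code has to recover the label from the final LABEL line. The helper below is a minimal sketch of one way to do that; the function name `parse_label`, the case-insensitive match, and the choice to read the last LABEL line are assumptions made for illustration, not part of the released artifact.

```python
import re
from typing import Optional

# Matches a whole line of the form "LABEL: accept" or "LABEL: reject".
# Case-insensitive matching is an assumption, added for robustness.
_LABEL_RE = re.compile(
    r"^\s*LABEL:\s*(accept|reject)\s*$", re.IGNORECASE | re.MULTILINE
)

# Same encoding as the label definition above: accept = 1, reject = 0.
LABEL_TO_ID = {"accept": 1, "reject": 0}


def parse_label(generation: str) -> Optional[int]:
    """Return 1 for accept, 0 for reject, or None if no valid LABEL line exists."""
    matches = _LABEL_RE.findall(generation)
    if not matches:
        return None  # caller can count this as an invalid output
    # Read the last LABEL line, since the label is expected to close the trace.
    return LABEL_TO_ID[matches[-1].lower()]
```

Returning `None` instead of raising keeps malformed generations countable, which is useful if an invalid output rate is tracked.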
eXTC pipeline
eXTC is a three-stage explainable classifier. This checkpoint is the output of all three stages:
Qwen3-4B (base)
│
├─ Stage I — SOP Learning (structured prompt optimization)
│ A natural-language rulebook (Standard Operating Procedure) is learned
│ via a structured prompt-optimization algorithm; used only to ground the
│ teacher in Stage II (not present at inference).
│
├─ Stage II — SOP-Grounded Reasoning Distillation (R-SFT)
│ Teacher: gpt-4.1-mini, prompted with <SOP, input>, rejection sampling
│ (M=4 traces/example, keep first trace whose label is correct).
│ Student: Qwen3-4B fine-tuned with LoRA (r=64, alpha=128, 2 epochs) on the
│ accepted reasoning+label traces, with class-balanced upsampling.
│
└─ Stage III — Beyond SOP via RL (BD-GRPO)
Balanced Dynamic GRPO: per-class oversampling, then drop zero-advantage
(homogeneous-rollout) groups and keep a class-balanced batch of
informative groups, with a binary label-correctness reward.
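The Stage II rejection-sampling rule (M=4 traces per example, keep the first trace whose label is correct) can be sketched as follows. This is an illustration under assumptions: `sample_trace` is a hypothetical stand-in for the teacher call, sampling stops early once a correct trace is found, and examples with no correct trace within the budget are dropped. The source does not state how such examples are handled.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical teacher interface: (sop, text) -> (reasoning_trace, predicted_label).
Sampler = Callable[[str, str], Tuple[str, int]]


def first_correct_trace(
    sample_trace: Sampler, sop: str, text: str, gold_label: int, m: int = 4
) -> Optional[str]:
    """Draw up to m teacher traces; return the first whose label matches gold."""
    for _ in range(m):
        trace, predicted = sample_trace(sop, text)
        if predicted == gold_label:
            return trace
    return None  # no correct trace within the sampling budget


def build_sft_set(
    examples: List[Tuple[str, int]], sample_trace: Sampler, sop: str, m: int = 4
) -> List[Tuple[str, str]]:
    """Collect (input, accepted trace) pairs, skipping examples with no correct trace."""
    kept = []
    for text, gold in examples:
        trace = first_correct_trace(sample_trace, sop, text, gold, m)
        if trace is not None:
            kept.append((text, trace))
    return kept
```

Class-balanced upsampling of the accepted pairs would be applied after this step and is not shown here.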
The released checkpoint is the one with the highest validation macro-F1 along the RL training trajectory. The test metrics below were computed on the held-out test set for that selected checkpoint.
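The group filtering described for Stage III can be illustrated with a small sketch. With a binary reward, a group whose rollouts all receive the same reward has zero group-relative advantage and contributes no learning signal, so it is dropped. The names `RolloutGroup`, `is_informative`, and `balanced_informative_batch`, and the rule of taking the same number of groups per class, are assumptions for illustration; the per-class oversampling step is omitted, and this is not the authors' implementation.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RolloutGroup:
    gold_label: int        # 1 = accept, 0 = reject
    rewards: List[float]   # one binary reward per rollout (1.0 if the label is correct)


def is_informative(group: RolloutGroup) -> bool:
    # If every rollout has the same reward, all group-relative advantages are zero.
    return len(set(group.rewards)) > 1


def balanced_informative_batch(
    groups: List[RolloutGroup], per_class: int, seed: int = 0
) -> List[RolloutGroup]:
    """Drop homogeneous groups, then keep an equal number of groups per class."""
    rng = random.Random(seed)
    by_class: Dict[int, List[RolloutGroup]] = {0: [], 1: []}
    for g in groups:
        if is_informative(g):
            by_class[g.gold_label].append(g)
    # Assumption: shrink to the scarcer class so the batch stays balanced.
    n = min(per_class, len(by_class[0]), len(by_class[1]))
    batch = rng.sample(by_class[0], n) + rng.sample(by_class[1], n)
    rng.shuffle(batch)
    return batch
```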
Test metrics
ICLR Reviews test set (n=655), greedy decoding (temperature=0):
| Metric | Value |
|---|---|
| Balanced accuracy | 0.8286 |
| Macro F1 | 0.8251 |
| Accuracy | 0.8412 |
| Invalid output rate | 0.000 |
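For readers who want to recompute these quantities on their own predictions, the sketch below shows the standard definitions in plain Python. The function name `evaluate` and the convention that an invalid output (`None`) counts as a wrong prediction are assumptions; the source does not say how invalid outputs were scored.

```python
from typing import Dict, List, Optional


def evaluate(gold: List[int], pred: List[Optional[int]]) -> Dict[str, float]:
    """gold holds 0/1 labels; pred holds 0/1, or None for an unparseable output."""
    n = len(gold)
    classes = (0, 1)
    recalls, f1s = [], []
    for c in classes:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        recalls.append(recall)
        f1s.append(f1)
    return {
        # Balanced accuracy is the unweighted mean of per-class recall.
        "balanced_accuracy": sum(recalls) / len(classes),
        # Macro F1 is the unweighted mean of per-class F1.
        "macro_f1": sum(f1s) / len(classes),
        "accuracy": sum(1 for g, p in zip(gold, pred) if g == p) / n,
        "invalid_rate": sum(1 for p in pred if p is None) / n,
    }
```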
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "extc-anon/extc-iclr-review"
tok = AutoTokenizer.from_pretrained(repo)
# Load the weights in bfloat16 and let transformers place them on the available device(s).
# `dtype` is the argument name in recent transformers releases; older ones use `torch_dtype`.
model = AutoModelForCausalLM.from_pretrained(repo, dtype=torch.bfloat16, device_map="auto")

# Build the instruction: task framing, the reviewer comments, then the classification request.
prompt = (
    "You are a meta-reviewer aggregating reviewer comments to classify paper acceptance.\n\n"
    "Reviewer comments:\n"
    "'The proposed method achieves only marginal gains over the baseline, and the "
    "experimental section lacks ablations on the key components in Section 3.2. "
    "Without these, it is unclear which design choices drive the improvement.'\n\n"
    "Classify whether the reviewer comments collectively imply accept or reject. "
    "Provide your reasoning and then the label."
)

# Wrap the prompt in the chat template and append the assistant turn marker.
text = tok.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True, tokenize=False,
)
ids = tok(text, return_tensors="pt").input_ids.to(model.device)
# Greedy decoding, matching the reported test setup.
out = model.generate(ids, max_new_tokens=1024, do_sample=False)
# Decode only the newly generated tokens (reasoning trace plus the final LABEL line).
print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True))
Format
- Standard HuggingFace transformers (safetensors, bfloat16, ~7.5 GB).
- Architecture: Qwen3ForCausalLM, 4.02B parameters.
- Test numbers above use greedy decoding (do_sample=False).
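As a rough sanity check on the quoted size, the weight footprint follows from the parameter count and the 2 bytes per bfloat16 value. The short calculation below is a back-of-the-envelope sketch: it ignores file metadata and assumes every parameter is stored in bfloat16. Under those assumptions the ~7.5 figure is consistent with binary gibibytes.

```python
PARAMS = 4.02e9        # parameter count quoted above
BYTES_PER_PARAM = 2    # bfloat16 stores each value in 16 bits

total_bytes = PARAMS * BYTES_PER_PARAM
size_gb = total_bytes / 1e9       # decimal gigabytes
size_gib = total_bytes / 2**30    # binary gibibytes

print(f"{size_gb:.2f} GB, {size_gib:.2f} GiB")
```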
License
Apache 2.0 (matches the Qwen3 base model).
Citation
Anonymous paper citation will be added at camera-ready.