eXTC — ICLR Reviews (paper acceptance prediction)

Anonymized artifact for a paper under double-blind review. Author identity and institution will be revealed at camera-ready.

This is the final-stage checkpoint of eXTC (eXplainable Text Classifier) for binary paper-acceptance prediction from ICLR reviewer comments. The data is drawn from the Re² peer-review corpus (ICLR main-conference rows).

Input: the concatenated reviewer comments for a paper.
Label: accept (1) or reject (0) — i.e., whether the reviewer comments collectively imply the paper should be accepted.
Output: a free-text reasoning trace followed by a final LABEL: <accept|reject> line — the reasoning serves as a local, inspectable explanation of the prediction.
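
Downstream code usually needs the label as a value rather than as text. The following is a minimal sketch of one way to extract it, assuming the final line has the form `LABEL: accept` or `LABEL: reject`. The helper name `parse_label`, the tolerance for literal angle brackets, and the choice to return `None` for an unparseable output are illustrative assumptions, not part of the released artifact.

```python
import re
from typing import Optional

# Assumed output format: a line "LABEL: accept" or "LABEL: reject".
# Optional angle brackets are tolerated in case the model emits them literally.
_LABEL_RE = re.compile(
    r"^\s*LABEL:\s*<?(accept|reject)>?\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_label(output: str) -> Optional[int]:
    """Return 1 for accept, 0 for reject, or None if no LABEL line is found."""
    matches = _LABEL_RE.findall(output)
    if not matches:
        return None  # would count as an invalid output
    # Use the last match so a label mentioned inside the reasoning is ignored.
    return 1 if matches[-1].lower() == "accept" else 0
```

Taking the last matching line keeps the parser robust if the reasoning trace itself happens to quote a label line.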

eXTC pipeline

eXTC is a three-stage explainable classifier. This checkpoint is the output of all three stages:

Qwen3-4B (base)
  │
  ├─ Stage I — SOP Learning (structured prompt optimization)
  │     A natural-language rulebook (Standard Operating Procedure) is learned
  │     via a structured prompt-optimization algorithm; used only to ground the
  │     teacher in Stage II (not present at inference).
  │
  ├─ Stage II — SOP-Grounded Reasoning Distillation (R-SFT)
  │     Teacher: gpt-4.1-mini, prompted with <SOP, input>, rejection sampling
  │     (M=4 traces/example, keep first trace whose label is correct).
  │     Student: Qwen3-4B fine-tuned with LoRA (r=64, alpha=128, 2 epochs) on the
  │     accepted reasoning+label traces, with class-balanced upsampling.
  │
  └─ Stage III — Beyond SOP via RL (BD-GRPO)
        Balanced Dynamic GRPO: per-class oversampling, then drop zero-advantage
        (homogeneous-rollout) groups and keep a class-balanced batch of
        informative groups, with a binary label-correctness reward.
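
The data-selection logic of Stages II and III can be sketched in a few lines. This is an illustrative reading of the description above, not the authors' implementation: `sample_trace` is a hypothetical stand-in for a teacher call, the group dictionaries with `label` and `rewards` keys are an assumed layout, and the per-class oversampling step that precedes filtering is not shown.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple


def rejection_sample(
    sample_trace: Callable[[], Tuple[str, int]],  # hypothetical teacher call
    gold_label: int,
    m: int = 4,
) -> Optional[str]:
    """Stage II: draw up to m traces and keep the first with a correct label."""
    for _ in range(m):
        trace, predicted = sample_trace()
        if predicted == gold_label:
            return trace
    return None  # no correct trace among m samples: example is skipped


def informative_groups(groups: List[Dict]) -> List[Dict]:
    """Stage III: drop groups whose rollouts all received the same reward.

    With a binary reward, identical rewards within a group make every
    group-normalized advantage zero, so the group gives no learning signal.
    """
    return [g for g in groups if len(set(g["rewards"])) > 1]


def balanced_batch(groups: List[Dict], per_class: int, seed: int = 0) -> List[Dict]:
    """Keep the same number of informative groups for each gold label."""
    rng = random.Random(seed)
    by_label: Dict[int, List[Dict]] = {0: [], 1: []}
    for g in informative_groups(groups):
        by_label[g["label"]].append(g)
    # Cap at the smaller class so the resulting batch stays balanced.
    k = min(per_class, len(by_label[0]), len(by_label[1]))
    batch = rng.sample(by_label[0], k) + rng.sample(by_label[1], k)
    rng.shuffle(batch)
    return batch
```

Capping both classes at the size of the smaller one is one simple way to keep the batch balanced; the actual BD-GRPO procedure may instead resample until a target batch size is reached.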

The released checkpoint is the one that achieved the best validation macro-F1 along the RL training trajectory. The test metrics below are reported for that selected checkpoint on the held-out test set.

Test metrics

ICLR Reviews test set (n=655), greedy decoding (temperature=0):

Metric	Value
Balanced accuracy	0.8286
Macro F1	0.8251
Accuracy	0.8412
Invalid output rate	0.000
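
For readers who want to reproduce these quantities on their own predictions, the sketch below computes the four metrics for the binary case using their standard definitions. The function name `binary_metrics` and the treatment of an unparseable output (`None`) as an error for its gold class are assumptions; the evaluation script behind the reported numbers may differ in detail.

```python
from typing import Dict, List, Optional


def binary_metrics(gold: List[int], pred: List[Optional[int]]) -> Dict[str, float]:
    """gold: 0/1 labels; pred: 0/1, or None for an unparseable output.

    Assumption: an invalid output counts as a miss for its gold class.
    """
    n = len(gold)
    recalls, f1s = [], []
    for c in (0, 1):
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
        f1s.append(2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0)
    return {
        # Balanced accuracy is the mean of the per-class recalls.
        "balanced_accuracy": sum(recalls) / 2,
        # Macro F1 is the unweighted mean of the per-class F1 scores.
        "macro_f1": sum(f1s) / 2,
        "accuracy": sum(g == p for g, p in zip(gold, pred)) / n,
        "invalid_output_rate": sum(p is None for p in pred) / n,
    }
```

Balanced accuracy and macro F1 weight both classes equally, which is why they can sit below plain accuracy when one class is both larger and easier.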

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "extc-anon/extc-iclr-review"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, dtype=torch.bfloat16, device_map="auto")

prompt = (
    "You are a meta-reviewer aggregating reviewer comments to classify paper acceptance.\n\n"
    "Reviewer comments:\n"
    "'The proposed method achieves only marginal gains over the baseline, and the "
    "experimental section lacks ablations on the key components in Section 3.2. "
    "Without these, it is unclear which design choices drive the improvement.'\n\n"
    "Classify whether the reviewer comments collectively imply accept or reject. "
    "Provide your reasoning and then the label."
)
text = tok.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True, tokenize=False,
)
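# Keep input_ids and attention_mask together so generate() receives an explicit mask.
inputs = tok(text, return_tensors="pt").to(model.device)
# Greedy decoding, matching the setup used for the reported test metrics.
out = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
# Decode only the newly generated tokens (reasoning trace plus the final LABEL line).
print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))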

Format

Standard Hugging Face transformers checkpoint (safetensors, bfloat16, ~7.5 GB).
Architecture: Qwen3ForCausalLM, 4.02B parameters.
Test numbers above use greedy decoding (do_sample=False).

License

Apache 2.0 (matches the Qwen3 base model).

Citation

Anonymous paper citation will be added at camera-ready.
