πŸ“ƒ [Paper] β€’ πŸ’» [Github] β€’ πŸ€— [Models] β€’ πŸ† [Playground]

Model Download and Inference

  1. Log in to Hugging Face

    huggingface-cli login --token $HUGGINGFACE_TOKEN
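
    Alternatively, you can authenticate from Python instead of the CLI. A minimal sketch using huggingface_hub.login, assuming your token is stored in the HUGGINGFACE_TOKEN environment variable:

    import os
    from huggingface_hub import login

    # assumes HUGGINGFACE_TOKEN is set in your environment
    login(token=os.environ["HUGGINGFACE_TOKEN"])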
    
  2. Download the model to a local directory

    from huggingface_hub import snapshot_download
    import os
    
    local_model_dir = os.path.join('/path/to/models/dir', 'JudgeLRM-7B')
    snapshot_download(repo_id="nuojohnchen/JudgeLRM-7B", local_dir=local_model_dir)
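
    If you only need the weights, config, and tokenizer files, snapshot_download also accepts an allow_patterns filter. A minimal sketch, assuming the repository ships safetensors weights (the pattern list below is an assumption, not taken from the repo):

    import os
    from huggingface_hub import snapshot_download

    local_model_dir = os.path.join('/path/to/models/dir', 'JudgeLRM-7B')
    # fetch only weight shards, JSON configs, and tokenizer files (assumed file patterns)
    snapshot_download(
        repo_id="nuojohnchen/JudgeLRM-7B",
        local_dir=local_model_dir,
        allow_patterns=["*.safetensors", "*.json", "*.model", "*.txt"],
    )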
    
  3. Inference Example

    import os
    from transformers import AutoTokenizer, pipeline
    
    # replace with your own evaluation case
    json_input = {
        "idx": 549,
        "motivation_app": "Google Docs",
        "cmp_key": "cerebras-gpt-6.7B_pythia-6.9b",
        "instruction": "Include important study notes and key points that someone should know about the given subject.",
        "input": "history of the USA",
        "response1": "The history of the United States is one of the most influential and influential countries in the world. It has had a profound impact on the world and continues to shape the world today. The United States has had a long and storied history, from its founding in 1775 to the present day. It has had a tumultuous and tumultuous history, from the Revolutionary War to the Civil War and beyond. The United States has had a tumultuous and tumultuous history, from the Revolutionary War to the Civil War and beyond. The United States has had a long and storied history, from its founding in 1775 to the present day. It has had a profound impact on the world and continues to shape the world today.",
        "response2": "1. The United States of America was founded in 1776.\n2. The Declaration of Independence was signed in 1776.\n3. The Constitution of the United States of America was signed in 1787.\n4. The Civil War began in 1861.\n5. The Emancipation Proclamation was issued in 1863.\n6. The 13th Amendment was ratified in 1865.\n7. The 14th Amendment was ratified in 1868.\n8. The 15th Amendment was ratified in 1870.\n9. The 16th Amendment was ratified in 1913.\n10. The 17th Amendment was ratified in 1913.\n11. The 18th Amendment was ratified in 1919.\n12. The 19th Amendment was ratified in 1920.\n13. The 20th Amendment was ratified in 1933.\n14. The 21st Amendment was ratified in 1933.\n15. The 22nd Amendment was ratified in",
        "annotator1": 2,
        "annotator2": 2,
        "annotator3": 2,
        "label": 2,
        "needed_reasoning_rate1-10": 7,
        "rate_explanation": "The task requires evaluating the quality of responses based on their adherence to the instruction to include important study notes and key points about the history of the USA. Response1 is repetitive and lacks specific details, while Response2 provides a clear, concise list of key historical events. The reasoning needed to judge these responses involves assessing clarity, specificity, and relevance to the instruction, which is moderately complex.\n----------------------------------------"
    }
    
    question = json_input.get("instruction", "").strip()+"\n"+json_input.get("input", "").strip()
    answer_1 = json_input.get("response1", "").strip()
    answer_2 = json_input.get("response2", "").strip()
    
    prompt = """<|im_start|>system\nYou are a helpful assistant. The assistant first performs a detailed, step-by-step reasoning process in its mind and then provides the user with the answer. The reasoning process and answer are enclosed within <think> </think> and<answer> </answer> tags, respectively, i.e., <think> detailed reasoning process here, explaining each step of your evaluation for both assistants </think><answer> answer here </answer>. Now the user asks you to judge the performance of two AI assistants in response to the question. Score assistants 1-10 (higher=better). Criteria includes helpfulness, relevance, accuracy, and level of detail. Avoid order, length, style or other bias. After thinking, when you finally reach a conclusion, clearly  provide your evaluation scores within <answer> </answer> tags, i.e. for example,<answer>3</answer><answer>5</answer>\n<|im_end|>\n<|im_start|>user\n[Question]\n{question}\n\n[Assistant 1’s Answer]\n{answer_1}\n\n[Assistant 2’s Answer]\n{answer_2}\n<|im_end|>\n<|im_start|>assistant\n<think>"""
    formatted_prompt = prompt.format(question=question, answer_1=answer_1, answer_2=answer_2)
    local_model_dir = os.path.join('/path/to/models/dir', 'JudgeLRM-7B')
    
    tokenizer = AutoTokenizer.from_pretrained(local_model_dir, use_fast=False)
    generator = pipeline(
        "text-generation",
        model=local_model_dir,
        tokenizer=tokenizer,
        device_map="auto",   # place the model on the available device(s) automatically
        torch_dtype="auto"
    )
    
    result = generator(formatted_prompt, max_new_tokens=2048)
    print(result[0]['generated_text'])
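
    The prompt instructs the model to close with two <answer> tags, one score per assistant. A small post-processing sketch that continues from result above (assuming the generation follows the expected format; if it does not, nothing is parsed):

    import re

    # drop the echoed prompt so the example scores in the system message are not matched
    generated = result[0]['generated_text'][len(formatted_prompt):]
    scores = re.findall(r"<answer>\s*(\d+(?:\.\d+)?)\s*</answer>", generated)
    if len(scores) >= 2:
        print(f"Assistant 1 score: {scores[0]}, Assistant 2 score: {scores[1]}")
    else:
        print("Could not parse scores; inspect the raw generation above.")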
    

Results Reproduction


Citation

@misc{nuo2025judgelrm,
      title={JudgeLRM: Large Reasoning Models as a Judge}, 
      author={Nuo Chen and Zhiyuan Hu and Qingyun Zou and Jiaying Wu and Qian Wang and Bryan Hooi and Bingsheng He},
      year={2025},
      eprint={2504.00050},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.00050}, 
}