Multi-Domain Reward Model Mistral-7B-Instruct

This is a multi-domain reward model built from weqweasdas/RM-Mistral-7B. It combines 23 fine-grained regression objectives across coherence, commonsense, empathy, and multicultural response quality with a prompt-conditioned gating network that produces a single preference score.
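
The combination step can be pictured roughly as follows. This is a minimal sketch, assuming an ArmoRM-style design in which the gating network produces one weight per objective from the prompt and the final score is the weighted sum of the objective scores. The function names, the softmax normalization, and the toy numbers are illustrative assumptions, not the RewardModelWithGating implementation.

```python
import math

NUM_OBJECTIVES = 23  # number of fine-grained regression objectives stated above


def softmax(logits):
    """Normalize gating logits into non-negative weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_score(objective_scores, gating_logits):
    """Combine per-objective scores into one preference score.

    objective_scores: one regression output per objective (for the response).
    gating_logits: one logit per objective (assumed to depend on the prompt).
    """
    if len(objective_scores) != len(gating_logits):
        raise ValueError("expected one gating logit per objective")
    weights = softmax(gating_logits)
    return sum(w * s for w, s in zip(weights, objective_scores))


# Toy example: uniform gating logits reduce the score to a plain average.
toy_scores = [float(i) for i in range(NUM_OBJECTIVES)]  # 0.0 .. 22.0
toy_logits = [0.0] * NUM_OBJECTIVES
print(gated_score(toy_scores, toy_logits))
```

With uniform logits every objective contributes equally; a prompt that pushes one logit far above the others makes the score track that single objective.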

The checkpoint was packaged with the custom RewardModelWithGating architecture used in the Multi-Domain Reward Model project.

Intended Use

Use this model to score and compare assistant responses when the evaluation should account for multiple quality dimensions rather than a single generic helpfulness score. The primary use case is reward modeling or offline response ranking for chat-style data.
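
Offline ranking of this kind can be sketched as below. The helper rank_responses and the placeholder scorer toy_score are hypothetical names introduced for illustration; in practice the scorer would wrap a call to the reward model.

```python
from typing import Callable, List, Tuple


def rank_responses(
    prompt: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[float, str]]:
    """Return (score, response) pairs sorted from best to worst."""
    scored = [(score_fn(prompt, response), response) for response in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_score(prompt: str, response: str) -> float:
    """Placeholder scorer (hypothetical): counts words in the response."""
    return float(len(response.split()))


ranked = rank_responses(
    "I failed an important exam and feel awful.",
    ["Try harder.", "I'm sorry. That is a hard setback, but it does not define you."],
    toy_score,
)
best_score, best_response = ranked[0]
print(best_response)
```

Only the ordering of the scores matters here, which matches the relative-comparison use described above.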

Training Data

The model uses multi-objective scoring and preference data from:

Evaluation

Preference accuracy by domain:

Domain          Accuracy (%)
Coherence       85.2052
Commonsense     97.8402
Empathy         95.1549
Multicultural   84.6998
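
Assuming the usual definition of preference accuracy (the share of chosen/rejected pairs in which the chosen response receives the higher score), a figure of this kind could be computed along these lines. The definition, the tie handling, and the sample scores are assumptions made for illustration; they are not taken from the project's evaluation code.

```python
from typing import Iterable, Tuple


def preference_accuracy(score_pairs: Iterable[Tuple[float, float]]) -> float:
    """Percentage of (chosen_score, rejected_score) pairs ranked correctly.

    Ties count as errors here; that is an assumption, not a documented rule.
    """
    pairs = list(score_pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    correct = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return 100.0 * correct / len(pairs)


# Made-up scores for illustration only: 3 of 4 pairs are ranked correctly.
example_pairs = [(2.1, 0.4), (1.3, 1.9), (0.7, -0.2), (3.0, 2.5)]
print(preference_accuracy(example_pairs))  # 75.0
```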

Usage Example

This checkpoint uses the project's custom RewardModelWithGating class. Run the example with the project's multidomain_model directory on the Python import path so that modeling_custom (multidomain_model/modeling_custom.py) can be imported.
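
One way to make the module importable is to add its directory to sys.path before importing. This is a sketch only: PROJECT_ROOT is a hypothetical placeholder for wherever the project is checked out locally, and installing the project as a package would work equally well.

```python
import sys
from pathlib import Path

# Hypothetical location of a local checkout of the project; adjust as needed.
PROJECT_ROOT = Path("path/to/project")
MODULE_DIR = PROJECT_ROOT / "multidomain_model"

# Put the directory first on the import path so `modeling_custom` resolves.
if str(MODULE_DIR) not in sys.path:
    sys.path.insert(0, str(MODULE_DIR))
```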

import torch
from transformers import AutoTokenizer
from modeling_custom import RewardModelWithGating

model_id = "mario-rc/multi-domain-rm-mistral-7b-it"
# Use bfloat16 on GPU 0 when CUDA is available; otherwise fall back to float32 on CPU.
dtype = torch.bfloat16 if torch.cuda.is_available() else torch.float32
device_map = {"": 0} if torch.cuda.is_available() else None

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = RewardModelWithGating.from_pretrained(
    model_id,
    device_map=device_map,
    dtype=dtype,
).eval()
device = next(model.parameters()).device

messages = [
    {"role": "user", "content": "I failed an important exam and feel awful."},
    {"role": "assistant", "content": "I'm sorry. That is a hard setback, but it does not define your ability. Take a little time to recover, then we can make a concrete study plan for the next attempt."},
]

encoded = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=4096,
)
# apply_chat_template may return a bare tensor of input IDs or a dict-like encoding; handle both.
inputs = {"input_ids": encoded.to(device)} if isinstance(encoded, torch.Tensor) else {
    key: value.to(device) for key, value in encoded.items()
}

# Score the conversation without tracking gradients.
with torch.no_grad():
    score = model(**inputs).score.float().item()

print(score)

Limitations

This is a reward model, not a standalone chat assistant. Scores are intended for relative comparison and should be calibrated for each downstream use case. The model inherits limitations from its base model and from the annotation coverage of the multi-domain datasets, especially for cultural contexts not represented in the evaluation data.
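
Because the scores are meant for relative comparison, one common way to read a pair of scores is as a Bradley-Terry style preference probability. The sketch below assumes that interpretation, along with an illustrative temperature parameter that would be tuned for each use case; neither is stated to be part of this model's training or API.

```python
import math


def preference_probability(score_a: float, score_b: float, temperature: float = 1.0) -> float:
    """Bradley-Terry style probability that response A is preferred over B.

    temperature is a hypothetical calibration knob: fit it on held-out
    preference pairs for the downstream task.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    margin = (score_a - score_b) / temperature
    # Numerically stable logistic function.
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    exp_margin = math.exp(margin)
    return exp_margin / (1.0 + exp_margin)


print(preference_probability(1.5, 1.5))  # 0.5: equal scores give no preference
```

A larger temperature flattens the probabilities toward 0.5, which is one simple way to express lower confidence in small score gaps.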

Credits

This model is based on the ArmoRM/RLHFlow reward-modeling approach and adapts it to custom multi-domain attributes for coherence, commonsense, empathy, and multicultural response quality.
