Helpfulness Detection Model

Model Description

This model assesses whether an assistant's response in a dialogue is helpful. It treats the task as binary classification: the model generates "YES" or "NO" depending on whether the response is helpful, relevant, and accurate in answering the user's query.

The model is a LoRA (Low-Rank Adaptation) fine-tune of the Meta-Llama-3-8B-Instruct base model. The adapter is loaded on top of the base model at inference time, as in the usage example in "How to Use".

Key aspects:

Evaluates the helpfulness of assistant responses.
Binary output (YES/NO), produced as generated text.
The system prompt instructs the model to reason silently; only the final binary verdict is generated.
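
Because the verdict arrives as generated text rather than a class index, downstream code has to map the string to a boolean. A minimal sketch, assuming a hypothetical helper `parse_verdict` (not part of the model repository) that tolerates surrounding whitespace and letter case and returns `None` for anything unexpected:

```python
from typing import Optional


def parse_verdict(text: str) -> Optional[bool]:
    """Map the decoded model output to True (YES), False (NO), or None."""
    # Hypothetical helper for illustration; it is not shipped with the model.
    normalized = text.strip().upper()
    if normalized == "YES":
        return True
    if normalized == "NO":
        return False
    # Anything else (empty string, partial token, other text) stays undecided.
    return None
```

Returning `None` instead of defaulting to NO keeps malformed generations visible, so they can be counted or inspected separately.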

Intended Use

Assessing the quality of answers generated by conversational AI.
Analyzing the relevance, clarity, and accuracy of assistant responses.
Providing a binary evaluation for further analysis in NLP tasks.
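
For further analysis, the per-response verdicts can be aggregated into a single score. A minimal sketch, assuming each verdict has already been converted to `True` (YES), `False` (NO), or `None` (unparseable output); the function name `helpfulness_rate` is hypothetical:

```python
from typing import Iterable, Optional


def helpfulness_rate(verdicts: Iterable[Optional[bool]]) -> Optional[float]:
    """Return the share of YES verdicts among the decided ones.

    Returns None when no verdict could be decided.
    """
    # Drop undecided entries so they do not count as NO.
    decided = [v for v in verdicts if v is not None]
    if not decided:
        return None
    # True counts as 1 and False as 0 when summed.
    return sum(decided) / len(decided)
```

For example, `helpfulness_rate([True, False, True, None])` averages over the three decided verdicts only.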

How to Use

Here is an example of how to use the model for helpfulness evaluation:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Base model and tokenizer.
# Meta-Llama-3-8B-Instruct is a gated repository: request access on the Hugging Face Hub
# and authenticate (for example with `huggingface-cli login`) before downloading.
base_model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(base_model_name)

# Load the LoRA-adapted model
model = PeftModel.from_pretrained(base_model, "juliushase/helpfulness-detection")

# Define the dialogue
messages = [
    {"role": "system", "content": "You are a scientist whose only task is to analyze if the answers from the assistant in the dialogue are helpful and answer the human's questions. Silently reason through the steps of analyzing the assistant's response, considering its relevance, clarity, and accuracy. After your analysis, only respond with YES or NO."},
    {"role": "user", "content": "What is the capital of the United States of America"},
    {"role": "assistant", "content": "The capital of the United States of America is Washington D.C."}
]

# Prepare input for the model
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Specify stop tokens for generation
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

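# Generate the response.
# One new token is requested because the expected output is just YES or NO;
# the very low temperature makes sampling effectively greedy.
outputs = model.generate(
    input_ids,
    max_new_tokens=1,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.001,
    top_p=1.0,
)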

# Decode the binary response (YES/NO)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
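
To score many question/answer pairs, the dialogue construction above can be wrapped in a small function. A minimal sketch, assuming a hypothetical helper `build_messages` that reuses the system prompt from the example and handles a single user turn followed by a single assistant turn; its return value can be passed to `tokenizer.apply_chat_template` exactly like `messages` above:

```python
from typing import Dict, List

# System prompt copied from the usage example above.
SYSTEM_PROMPT = (
    "You are a scientist whose only task is to analyze if the answers from the "
    "assistant in the dialogue are helpful and answer the human's questions. "
    "Silently reason through the steps of analyzing the assistant's response, "
    "considering its relevance, clarity, and accuracy. "
    "After your analysis, only respond with YES or NO."
)


def build_messages(question: str, answer: str) -> List[Dict[str, str]]:
    """Build the chat messages for one question/answer pair to be judged."""
    # Hypothetical helper for illustration; it is not shipped with the model.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
    ]
```

Keeping the system prompt in one constant avoids accidental wording drift between evaluations, which matters because the adapter was presumably tuned against this exact instruction.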
