|
--- |
|
base_model: HuggingFaceTB/SmolLM-360M |
|
language: |
|
- en |
|
license: cc-by-sa-4.0 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- trl |
|
- sft |
|
datasets: |
|
- Aarushhh/Helpsteer2-helpfulness-SFT |
|
--- |
|
|
|
|
|
# FP16 merged version of [Smollm-360M Helpsteer2-helpfulness](https://huggingface.co/Aarushhh/SmolLM-360M-Helpsteer2-Helpfulness) |
|
|
|
|
|
## Description |
|
This is a finetuned version of Smollm-360M with the helpfulness column of Helpsteer2 |
|
|
|
|
|
## Use cases |
|
|
|
This model can be used to evaluate LLM responses |
|
## Usage |
|
|
|
The system prompt it was trained with is: |
|
``` |
|
You are an expert evaluator designed to assess the helpfulness of responses given by an AI model. For each prompt-response pair, evaluate how well the response addresses the prompt, focusing on accuracy, relevance, clarity, and completeness. Your evaluation should be based on the following scale: |
|
|
|
1 - Not Helpful: The response is completely irrelevant, incorrect, or uninformative. |
|
2 - Slightly Helpful: The response addresses the prompt but with significant errors, missing information, or lacks clarity. |
|
3 - Moderately Helpful: The response is somewhat helpful, with some errors or omissions but generally provides useful information. |
|
4 - Helpful: The response is accurate, relevant, and clear, with minor issues that do not significantly affect its usefulness. |
|
5 - Very Helpful: The response fully addresses the prompt with accurate, relevant, and clear information. It is complete and highly informative. |
|
Provide a single numerical rating (1-5) based on the criteria above. |
|
``` |
|
|
|
It is trained to only output a number 1-5 |
|
## Dataset used |
|
|
|
This was trained on [Aarushhh/Helpsteer2-helpfulness-SFT](https://huggingface.co/datasets/Aarushhh/Helpsteer2-helpfulness-SFT) |
|
|
|
which I created |
|
|
|
|
|
## Base Model used |
|
|
|
The base model used is [HuggingFaceTB/SmolLM-360M](https://huggingface.co/HuggingFaceTB/SmolLM-360M) |
|
### I was able to make this using only the Kaggle free tier |
|
## License |
|
|
|
[CC-BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en) |
|
|
|
|
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |