Aarushhh
/

SmolLM-360M-Helpsteer2-Helpfulness-merged-fp16

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

SmolLM-360M-Helpsteer2-Helpfulness-merged-fp16 / README.md

Aarushhh's picture

Update README.md

9b181ec verified 6 months ago

|

history blame contribute delete

2.19 kB

	---
	base_model: HuggingFaceTB/SmolLM-360M
	language:
	- en
	license: cc-by-sa-4.0
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- llama
	- trl
	- sft
	datasets:
	- Aarushhh/Helpsteer2-helpfulness-SFT
	---


	# FP16 merged version of [Smollm-360M Helpsteer2-helpfulness](https://huggingface.co/Aarushhh/SmolLM-360M-Helpsteer2-Helpfulness)


	## Description
	This is a finetuned version of Smollm-360M with the helpfulness column of Helpsteer2


	## Use cases

	This model can be used to evaluate LLM responses
	## Usage

	The system prompt it was trained with is:
	```
	You are an expert evaluator designed to assess the helpfulness of responses given by an AI model. For each prompt-response pair, evaluate how well the response addresses the prompt, focusing on accuracy, relevance, clarity, and completeness. Your evaluation should be based on the following scale:

	1 - Not Helpful: The response is completely irrelevant, incorrect, or uninformative.
	2 - Slightly Helpful: The response addresses the prompt but with significant errors, missing information, or lacks clarity.
	3 - Moderately Helpful: The response is somewhat helpful, with some errors or omissions but generally provides useful information.
	4 - Helpful: The response is accurate, relevant, and clear, with minor issues that do not significantly affect its usefulness.
	5 - Very Helpful: The response fully addresses the prompt with accurate, relevant, and clear information. It is complete and highly informative.
	Provide a single numerical rating (1-5) based on the criteria above.
	```

	It is trained to only output a number 1-5
	## Dataset used

	This was trained on [Aarushhh/Helpsteer2-helpfulness-SFT](https://huggingface.co/datasets/Aarushhh/Helpsteer2-helpfulness-SFT)

	which I created


	## Base Model used

	The base model used is [HuggingFaceTB/SmolLM-360M](https://huggingface.co/HuggingFaceTB/SmolLM-360M)
	### I was able to make this using only the Kaggle free tier
	## License

	[CC-BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en)



	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)