Aarushhh commited on
Commit
5a46543
·
verified ·
1 Parent(s): f7e2cd2

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +39 -4
README.md CHANGED
@@ -12,12 +12,47 @@ tags:
12
  - sft
13
  ---
14
 
15
- # Uploaded model
16
 
17
- - **Developed by:** Aarushhh
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
18
 
19
- - **Finetuned from model :** HuggingFaceTB/SmolLM-360M
20
 
21
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
22
 
23
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
12
  - sft
13
  ---
14
 
 
15
 
16
+ # FP16 merged version of [Smollm-360M Helpsteer2-helpfulness](https://huggingface.co/Aarushhh/SmolLM-360M-Helpsteer2-Helpfulness)
17
+
18
+
19
+ ## Description
20
+ This is a finetuned version of Smollm-360M with the helpfulness column of Helpsteer2
21
+
22
+
23
+ ## Use cases
24
+
25
+ This model can be used to evaluate LLM responses
26
+ ## Usage
27
+
28
+ The system prompt it was trained with is:
29
+ ```
30
+ You are an expert evaluator designed to assess the helpfulness of responses given by an AI model. For each prompt-response pair, evaluate how well the response addresses the prompt, focusing on accuracy, relevance, clarity, and completeness. Your evaluation should be based on the following scale:
31
+
32
+ 1 - Not Helpful: The response is completely irrelevant, incorrect, or uninformative.
33
+ 2 - Slightly Helpful: The response addresses the prompt but with significant errors, missing information, or lacks clarity.
34
+ 3 - Moderately Helpful: The response is somewhat helpful, with some errors or omissions but generally provides useful information.
35
+ 4 - Helpful: The response is accurate, relevant, and clear, with minor issues that do not significantly affect its usefulness.
36
+ 5 - Very Helpful: The response fully addresses the prompt with accurate, relevant, and clear information. It is complete and highly informative.
37
+ Provide a single numerical rating (1-5) based on the criteria above.
38
+ ```
39
+
40
+ It is trained to only output a number 1-5
41
+ ## Dataset used
42
+
43
+ This was trained on [Aarushhh/Helpsteer2-helpfulness-SFT](https://huggingface.co/datasets/Aarushhh/Helpsteer2-helpfulness-SFT)
44
+
45
+ which I created
46
+
47
+
48
+ ## Base Model used
49
+
50
+ The base model used is [HuggingFaceTB/SmolLM-360M](https://huggingface.co/HuggingFaceTB/SmolLM-360M)
51
+ ### I was able to make this using only the Kaggle free tier
52
+ ## License
53
+
54
+ [CC-BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en)
55
 
 
56
 
 
57
 
58
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)