ryota39/RakutenAI-7B-instruct-reward


ryota39 committed on Jun 27

Commit 36a731d • 1 Parent(s): d505d25

Update README.md

Files changed (1)

README.md +2 -2

README.md CHANGED

@@ -3,8 +3,8 @@ library_name: transformers
 tags: []
 ---
-- this model was trained to classify whether input text is chosen sentence or rejected text
-- the probability (logits after passing softmax function) in last layer of this model can be used to quantify the preferencefrom user input
 - fine-tuned [Rakuten/RakutenAI-7B-instruct](https://huggingface.co/Rakuten/RakutenAI-7B-instruct) via [LoRA](https://arxiv.org/abs/2106.09685) using [open-preference-v0.3](https://huggingface.co/datasets/ryota39/open_preference-v0.3)
 - trained on bf16 format

 tags: []
 ---
+- this model was trained to classify whether input text comes from "chosen sentence" or "rejected sentence"
+- the probability (logits after passing softmax function) in last layer of this model can be used to quantify the preference from user input
 - fine-tuned [Rakuten/RakutenAI-7B-instruct](https://huggingface.co/Rakuten/RakutenAI-7B-instruct) via [LoRA](https://arxiv.org/abs/2106.09685) using [open-preference-v0.3](https://huggingface.co/datasets/ryota39/open_preference-v0.3)
 - trained on bf16 format
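
The README describes using the softmax probability of the last-layer logits as a preference score. A minimal sketch of that step is shown here, assuming the model emits two logits per input (one per class). The label order, the `chosen_index` parameter, the helper names, and the example logit values are all hypothetical; the README does not state which output index corresponds to "chosen sentence".

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def preference_score(logits, chosen_index=1):
    """Return the probability of the class assumed to mean "chosen".

    chosen_index=1 is an assumption; check the model's label mapping.
    """
    return softmax(logits)[chosen_index]


# Made-up logits for illustration only, not real model output.
example_logits = [-0.3, 1.2]
probs = softmax(example_logits)
score = preference_score(example_logits)
print(f"P(rejected)={probs[0]:.3f}  P(chosen)={probs[1]:.3f}")
```

With a real checkpoint, the logits would come from the classification head's output rather than a hand-written list; a higher score would then indicate that the input looks more like a "chosen" response under the assumed label order.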