Text Classification
Transformers
PyTorch
English
deberta-v2
Inference Endpoints
theblackcat102 committed on
Commit
a88e417
1 Parent(s): c7ac1c6

Update README.md

Files changed (1): README.md (+237, -0)
---
license: mit
datasets:
- openai/webgpt_comparisons
- openai/summarize_from_feedback
- Anthropic/hh-rlhf
language:
- en
---

# Reward model on deberta-v2-xxlarge (1.5B)

Reward model used in RLHF, trained on the WebGPT comparisons, Summarize from Human Feedback, and Open Assistant user-ranked datasets.

# Model Details

## Model Description

- **Developed by:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

## Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [Open Assistant](https://github.com/LAION-AI/Open-Assistant)
- **Paper:** [InstructGPT](https://cdn.openai.com/papers/Training_language_models_to_follow_instructions_with_human_feedback.pdf): we try to replicate it as closely as we can on our hardware and existing datasets
- **Demo [optional]:** [More Information Needed]

# Uses

This model was trained on human feedback comparison examples, which penalize bad or rude responses with lower scores.

## Direct Use

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = 'theblackcat102/deberta-v2-xxlarge-rm'
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "I just got out of prison, any suggestion?"
good_helpful = "I am sorry to hear about it, it must be a hard time inside"
bad_text = "Stay away from me, you scumbag convict"

# Encode each (prompt, response) pair and read the scalar reward logit
pos = tokenizer(prompt, good_helpful, return_tensors='pt')
neg = tokenizer(prompt, bad_text, return_tensors='pt')
pos_score = model(**pos).logits[0]
neg_score = model(**neg).logits[0]
print(pos_score, neg_score)
# tensor([-1.3449], grad_fn=<SelectBackward0>) tensor([-2.0942], grad_fn=<SelectBackward0>)
```
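
Reward models of this kind are commonly trained with a pairwise (Bradley-Terry style) objective, in which case the score margin can be read as a preference probability. A minimal sketch reusing `pos_score` and `neg_score` from the snippet above; the pairwise assumption is ours, not a claim from this card:

```python
import torch

# P(good_helpful preferred over bad_text) = sigmoid(score margin),
# under the assumed pairwise training objective
pref_prob = torch.sigmoid(pos_score - neg_score)
print(pref_prob)  # ~0.68 for the example scores above
```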

## Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]

## Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]

# Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

## Recommendations

How to use it as a rank function:

```python
import numpy as np
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

rank_model = AutoModelForSequenceClassification.from_pretrained('theblackcat102/deberta-v2-xxlarge-rm').cuda()
rank_tokenizer = AutoTokenizer.from_pretrained('theblackcat102/deberta-v2-xxlarge-rm')

def divide_chunks(l, n):
    # yield successive n-sized chunks of l
    for i in range(0, len(l), n):
        yield l[i:i + n]

@torch.no_grad()
def rank_model_fn(samples, **kwargs):
    output_scores = []
    for chunk_samples in divide_chunks(samples, 16):
        is_empty = []
        prefixes, postfixes = [], []
        for sample in chunk_samples:
            prefix, postfix = sample.split('[SEP]')
            postfix = postfix.strip()
            # flag empty or degenerate completions (e.g. one repeated character)
            if len(postfix) == 0 or len(set(postfix)) <= 3:
                is_empty.append(True)
            else:
                is_empty.append(False)
            postfixes.append(postfix)
            prefixes.append(prefix)
        is_empty = np.array(is_empty)
        inputs = rank_tokenizer(prefixes, postfixes, return_tensors="pt", padding=True)
        inputs.pop("token_type_ids", None)
        inputs = {key: tensor.cuda() for key, tensor in inputs.items()}
        scores = rank_model(**inputs).logits[:, 0].detach().cpu()
        # assign a fixed low reward to degenerate completions
        scores[is_empty] = -4
        output_scores += [s for s in scores]
    return torch.from_numpy(np.array(output_scores))
```
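
Each sample is a single string of the form `prompt [SEP] response`, which is the format the `split('[SEP]')` call above implies. A quick usage sketch:

```python
samples = [
    "I just got out of prison, any suggestion? [SEP] I am sorry to hear about it, it must be a hard time inside",
    "I just got out of prison, any suggestion? [SEP] Stay away from me, you scumbag convict",
]
scores = rank_model_fn(samples)
print(scores)  # higher score = more preferred response
```

The `(samples, **kwargs)` signature is the shape RLHF training loops such as trlX typically expect for a reward or rank function.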

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

# Training Details

## Training Procedure

Check out our training repo [here](https://github.com/LAION-AI/Open-Assistant/tree/main/model/reward/instructor).
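
For orientation, the InstructGPT-style objective for reward models is a pairwise ranking loss over (preferred, rejected) comparisons. A minimal sketch; the actual implementation lives in the training repo linked above:

```python
import torch.nn.functional as F

def pairwise_ranking_loss(pos_logits, neg_logits):
    # maximize the log-sigmoid of the margin between the preferred
    # completion's score and the rejected one's, as in InstructGPT
    return -F.logsigmoid(pos_logits - neg_logits).mean()
```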

### Preprocessing [optional]

[More Information Needed]

### Training Hyperparameters

```yaml
model_name: microsoft/deberta-v2-xxlarge
learning_rate: 2e-6
scheduler: cosine
gradient_checkpointing: false
gradient_accumulation_steps: 12
per_device_train_batch_size: 1
per_device_eval_batch_size: 4
warmup_steps: 600
eval_steps: 1000000
save_steps: 1000
max_length: 512
num_train_epochs: 2
datasets:
- webgpt
- hfsummary
- anthropic_rlhf
- oa_private
```
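
Most of these keys mirror Hugging Face `TrainingArguments` fields. A hedged sketch of the equivalent arguments; the `output_dir` and the exact wiring are our assumptions, not taken from the training repo:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="deberta-v2-xxlarge-rm",  # assumed; not specified in the config
    learning_rate=2e-6,
    lr_scheduler_type="cosine",
    gradient_checkpointing=False,
    gradient_accumulation_steps=12,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=4,
    warmup_steps=600,
    eval_steps=1_000_000,  # effectively disables mid-training evaluation
    save_steps=1000,
    num_train_epochs=2,
)
# max_length: 512 is applied at tokenization time, not via TrainingArguments
```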

### Speeds, Sizes, Times [optional]

Trained on 8 A100 80GB GPUs. Since we use the same batch strategy as InstructGPT, a batch size of 1 actually corresponds to a batch of N-1 comparisons, where N refers to the number of negative examples. This is why we recommend using the largest-VRAM GPU you can find to train this model.
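
To see why a single example fans out, here is a sketch of InstructGPT's pairing scheme, where a prompt with K ranked completions contributes every (preferred, rejected) pair; the exact pairing used here is described only as N-1 above, so this is an illustration rather than the repo's code:

```python
from itertools import combinations

def ranking_to_pairs(completions_best_to_worst):
    # each (earlier, later) pair in the ranking is one comparison,
    # so one "batch element" expands to K*(K-1)/2 forward-pass pairs
    return list(combinations(completions_best_to_worst, 2))

pairs = ranking_to_pairs(["best", "good", "bad", "worst"])
print(len(pairs))  # 6 comparisons from a single 4-completion example
```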

# Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

## Testing Data, Factors & Metrics

### Testing Data

<!-- This should link to a Data Card if possible. -->

[More Information Needed]

### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

## Results

[More Information Needed]

### Summary

# Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

# Technical Specifications [optional]

## Model Architecture and Objective

[More Information Needed]

## Compute Infrastructure

[More Information Needed]

### Hardware

[More Information Needed]

### Software

[More Information Needed]

# Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

# Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]

# More Information [optional]

[More Information Needed]

# Model Card Authors [optional]

[More Information Needed]

# Model Card Contact

[More Information Needed]