Text Classification
Transformers
PyTorch
English
deberta-v2
reward-model
reward_model
RLHF
text-embeddings-inference
How to use OpenAssistant/reward-model-deberta-v3-large-v2 with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-classification", model="OpenAssistant/reward-model-deberta-v3-large-v2")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("OpenAssistant/reward-model-deberta-v3-large-v2")
    model = AutoModelForSequenceClassification.from_pretrained("OpenAssistant/reward-model-deberta-v3-large-v2")
Commit c355404 (parent: 2d417a6)
Update README.md

README.md CHANGED
@@ -77,7 +77,7 @@ Validation split accuracy
 | **[deberta-v3-large-v2](https://huggingface.co/OpenAssistant/reward-model-deberta-v3-large-v2)** | **61.57** | 71.47 | 99.88 | **69.25** |
 | [deberta-v3-large](https://huggingface.co/OpenAssistant/reward-model-deberta-v3-large) | 61.13 | 72.23 | **99.94** | 55.62 |
 | [deberta-v3-base](https://huggingface.co/OpenAssistant/reward-model-deberta-v3-base) | 59.07 | 66.84 | 99.85 | 54.51 |
-| deberta-v2-xxlarge | 58.67 | 73.27 | 99.77 | 66.74 |
+| deberta-v2-xxlarge | 58.67 | **73.27** | 99.77 | 66.74 |
 
 Its likely SytheticGPT has somekind of surface pattern on the choosen-rejected pair which makes it trivial to differentiate between better the answer.
 
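The only change in this hunk is bolding 73.27, which appears to follow the table's convention of bolding the highest score in each column. A minimal Python sketch for checking that convention is below. The scores are copied from the four rows shown in the diff. The column headers fall outside the hunk, so columns are identified by position only, and the names `ROWS` and `best_per_column` are hypothetical helpers, not part of the repository.

```python
# Scores copied from the four table rows in the diff. Column headers are not
# visible in the hunk, so each column is referred to by its position.
ROWS = {
    "deberta-v3-large-v2": [61.57, 71.47, 99.88, 69.25],
    "deberta-v3-large": [61.13, 72.23, 99.94, 55.62],
    "deberta-v3-base": [59.07, 66.84, 99.85, 54.51],
    "deberta-v2-xxlarge": [58.67, 73.27, 99.77, 66.74],
}


def best_per_column(rows):
    """Return a (model, score) pair holding the highest score in each column."""
    n_cols = len(next(iter(rows.values())))
    best = []
    for col in range(n_cols):
        # Pick the row whose score in this column is largest.
        name = max(rows, key=lambda model: rows[model][col])
        best.append((name, rows[name][col]))
    return best


if __name__ == "__main__":
    for col, (name, score) in enumerate(best_per_column(ROWS)):
        print(f"column {col}: {name} ({score})")
```

With the values above, the second column's maximum belongs to deberta-v2-xxlarge, which is the cell the commit bolds. The other three maxima are already bold in the unchanged rows.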