---
license: apache-2.0
datasets:
- berkeley-nest/Nectar
language:
- en
---

# Use cases

This model is used to deep-clean the Rhino dataset, raising its quality. It reached an average MSE loss of 0.095 during training.

We recommend using the sigmoid function to turn the logits into probabilities:

```python
import torch

# Standard sigmoid; note the negated exponent. Equivalent to torch.sigmoid(logits).
probs = 1 / (1 + torch.exp(-logits))
```
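
As a rough sketch of how these probabilities might be used for cleaning, the snippet below applies a numerically stable sigmoid to raw scores and keeps the rows at or above a cutoff. The function names, the `0.5` threshold, and the assumption that a higher probability means a better sample are illustrative and are not taken from the Rhino repo. It uses plain Python so it runs without loading a model.

```python
import math


def sigmoid(logit: float) -> float:
    """Numerically stable sigmoid for a single raw logit."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    # For negative inputs, use exp(logit) so the exponent never overflows.
    z = math.exp(logit)
    return z / (1.0 + z)


def filter_by_score(rows, logits, threshold=0.5):
    """Keep rows whose sigmoid(logit) is at or above `threshold`.

    `threshold` is a hypothetical cutoff, not a value from the Rhino repo.
    """
    if len(rows) != len(logits):
        raise ValueError("rows and logits must have the same length")
    return [row for row, logit in zip(rows, logits) if sigmoid(logit) >= threshold]


if __name__ == "__main__":
    rows = ["sample a", "sample b", "sample c"]
    logits = [2.0, -1.5, 0.0]  # made-up scores for illustration
    print(filter_by_score(rows, logits))
```

In practice the logits would come from this model's forward pass, and the cutoff would be chosen by inspecting the score distribution on a held-out sample.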

# Training

This model was trained on berkeley-nest/Nectar using trl's RewardTrainer. The dataset is curated on the fly during training, as explained in the Rhino repo.
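
The curation procedure itself lives in the Rhino repo and is not reproduced here. Purely as an illustration of how ranked answers can be turned into pairwise preference data of the kind reward-model trainers typically consume, the sketch below assumes each record has a `prompt` and a list of `answers` with `answer` and `rank` fields, where rank 1 is best. The function name, the record shape, and the output keys are assumptions for this example.

```python
from itertools import combinations


def ranked_answers_to_pairs(record):
    """Turn one ranked record into (chosen, rejected) preference pairs.

    Assumed shape (hypothetical):
    {"prompt": str, "answers": [{"answer": str, "rank": number}, ...]}
    with a lower rank meaning a better answer.
    """
    ordered = sorted(record["answers"], key=lambda a: a["rank"])
    pairs = []
    # combinations() preserves the sorted order, so `better` never ranks below `worse`.
    for better, worse in combinations(ordered, 2):
        if better["rank"] == worse["rank"]:
            continue  # ties carry no preference signal
        pairs.append(
            {
                "prompt": record["prompt"],
                "chosen": better["answer"],
                "rejected": worse["answer"],
            }
        )
    return pairs


if __name__ == "__main__":
    example = {
        "prompt": "What is 2 + 2?",
        "answers": [
            {"answer": "5", "rank": 3},
            {"answer": "4", "rank": 1},
            {"answer": "Probably 4.", "rank": 2},
        ],
    }
    for pair in ranked_answers_to_pairs(example):
        print(pair["chosen"], "|", pair["rejected"])
```

Generating every ordered pair is only one option; sampling a subset of pairs per prompt keeps the training set smaller when records have many answers.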
|