Use cases

This model is used to deep clean the Rhino dataset, improving its quality. It achieved an average MSE loss of 0.095 during training. We recommend applying the sigmoid function to turn the logits into probabilities:

1 / (1 + torch.exp(-logits))
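
As a minimal sketch of that conversion, the snippet below applies the sigmoid to a few made-up logits and compares the hand-written form with PyTorch's built-in torch.sigmoid. The logit values and the 0.5 cutoff are assumptions for illustration only; the model card does not state a threshold, and real logits would come from the model's output.

```python
import torch

# Hypothetical logits standing in for the model's raw outputs (not real scores).
logits = torch.tensor([-2.0, 0.0, 3.0])

# Hand-written sigmoid; the exponent must be negated.
manual = 1 / (1 + torch.exp(-logits))

# Built-in equivalent, which is more robust for large-magnitude logits.
probs = torch.sigmoid(logits)

# Assumed cutoff for illustration; the source does not specify one.
threshold = 0.5
keep = probs >= threshold

print(probs)
print(keep)
```

Both forms give the same probabilities here; torch.sigmoid is usually the safer choice because it avoids overflow in the exponential for very negative logits.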

Training

This model was trained on berkeley-nest/Nectar using trl's RewardTrainer. The dataset is curated on the fly during training, as explained in the Rhino repo.

Dataset used to train M4-ai/TinyMistral-248M-v2-cleaner