Enhancing Paraphrase Type Generation: The Impact of DPO and RLHF Evaluated with Human-Ranked Data
This model is a fine-tuned version of meta-llama/Llama-3.1-8B; the training dataset is not specified. After the final epoch it reaches a validation loss of 0.7851 and an accuracy of 0.4925 on the evaluation set.
Model description: More information needed
Intended uses & limitations: More information needed
Training and evaluation data: More information needed
The training hyperparameters are not listed. Training results per epoch:
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| No log | 1.0 | 34 | 0.7886 | 0.4925 |
| No log | 2.0 | 68 | 0.7860 | 0.4925 |
| No log | 3.0 | 102 | 0.7851 | 0.4925 |
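
The trend in the table can be checked with a few lines of Python. This is only a sketch: the numbers are copied by hand from the rows above, and the variable names (`rows`, `loss_deltas`, `accuracy_changed`) are illustrative, not part of any training script for this model.

```python
# Rows copied from the training results table:
# (epoch, step, validation_loss, accuracy)
rows = [
    (1.0, 34, 0.7886, 0.4925),
    (2.0, 68, 0.7860, 0.4925),
    (3.0, 102, 0.7851, 0.4925),
]

# Optimizer steps per epoch, derived from the first row.
steps_per_epoch = rows[0][1] / rows[0][0]

# Change in validation loss between consecutive epochs.
loss_deltas = [round(b[2] - a[2], 4) for a, b in zip(rows, rows[1:])]

# True only if the reported accuracy differs between any two epochs.
accuracy_changed = len({r[3] for r in rows}) > 1

print("steps per epoch:", steps_per_epoch)
print("validation loss deltas:", loss_deltas)
print("accuracy changed:", accuracy_changed)
```

On these numbers the validation loss moves only in the third decimal place from one epoch to the next, and the reported accuracy is identical in all three epochs.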