💡 DICE
Updated Jul 28
Self-alignment with DPO Implicit Rewards
Bootstrapping Language Models with DPO Implicit Rewards • Paper • 2406.09760 • Published Jun 14 • 38
sail/Llama-3-Base-8B-DICE-Iter1 • Text Generation • Updated Jul 11 • 6 • 1
sail/Llama-3-Base-8B-DICE-Iter2 • Text Generation • Updated Jul 11 • 6 • 2
sail/Zephyr-7B-DICE-Iter1 • Text Generation • Updated Jul 11 • 7
sail/Zephyr-7B-DICE-Iter2 • Text Generation • Updated Jul 11 • 7