PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 16 days ago • 104
Comparing DPO with IPO and KTO Collection A collection of chat models to explore the differences between three alignment techniques: DPO, IPO, and KTO. • 56 items • Updated Jan 9 • 31
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 253
Awesome feedback datasets Collection A curated list of datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO. • 19 items • Updated Apr 12 • 53
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 93