anakin87 committed
Commit 76e5b9c
1 Parent: b946408

link to GGUF version

Files changed (1): README.md (+2, -0)
README.md CHANGED

```diff
@@ -23,6 +23,8 @@ language:
 This is an ORPO fine-tune of [google/gemma-2b](https://huggingface.co/google/gemma-2b) with
 [`alvarobartt/dpo-mix-7k-simplified`](https://huggingface.co/datasets/alvarobartt/dpo-mix-7k-simplified).
 
+**⚡ Quantized version (GGUF)**: https://huggingface.co/anakin87/gemma-2b-orpo-GGUF
+
 ## ORPO
 [ORPO (Odds Ratio Preference Optimization)](https://arxiv.org/abs/2403.07691) is a new training paradigm that combines the usually separated phases
 of SFT (Supervised Fine-Tuning) and Preference Alignment (usually performed with RLHF or simpler methods like DPO).
```
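The added link points to a GGUF export of the model for llama.cpp-style runtimes. As a rough usage sketch (the quantization filename below is an assumption, not confirmed by the repo; check its file list), loading it with llama-cpp-python might look like:

```python
# Hedged sketch: run the GGUF build via llama-cpp-python.
# The `filename` glob is a guess at a typical Q4_K_M quant name.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="anakin87/gemma-2b-orpo-GGUF",
    filename="*q4_k_m.gguf",  # assumed quant file; adjust to an actual filename
)
out = llm("Briefly explain what ORPO is.", max_tokens=64)
print(out["choices"][0]["text"])
```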
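For context on the README's description: ORPO folds preference alignment into the SFT step by adding an odds-ratio penalty that favors chosen over rejected responses, so no separate reward model or reference model is needed. A minimal sketch of such a fine-tune with TRL's `ORPOTrainer`, assuming a recent TRL release with ORPO support; the hyperparameters are illustrative, and a real run would first flatten the dataset's chat-format columns into plain-text `prompt`/`chosen`/`rejected` strings:

```python
# Minimal ORPO fine-tuning sketch with TRL (assumed API: ORPOConfig/ORPOTrainer).
# Model and dataset names mirror the README; settings are illustrative only.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "google/gemma-2b"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Preference pairs; ORPOTrainer expects prompt/chosen/rejected columns
# (flattened to plain text, e.g. via the tokenizer's chat template).
dataset = load_dataset("alvarobartt/dpo-mix-7k-simplified", split="train")

config = ORPOConfig(
    output_dir="gemma-2b-orpo",
    beta=0.1,                       # weight of the odds-ratio loss term
    per_device_train_batch_size=2,  # illustrative assumption
    num_train_epochs=1,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```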