dvilasuero committed on
Commit 476767c
1 Parent(s): 70922ab

Update README.md

Files changed (1):
  1. README.md +68 -2
README.md CHANGED
@@ -19,10 +19,76 @@ tags:
  </div>


  | Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average | dpo-pairs | % original pairs |
  |-------------------------------------------------------------------------------------------------------------------|--------:|--------:|-----------:|---------:|--------:|----------:|-----------------:|
  | [argilla/distilabeled-Hermes-2.5-Mistral-7B](https://huggingface.co/argilla/distilabeled-Hermes-2.5-Mistral-7B) | **44.64** | **73.35** | 55.96 | 42.21 | **54.04** | 5,922 | **46%** |
  | [dvilasuero/NeuralHermes-2.5-Mistral-7B-distilabel](https://huggingface.co/dvilasuero/NeuralHermes-2.5-Mistral-7B-distilabel) (first experiment) | 44.27 | 73.3 | **56.26** | **42.25** | 54.02 | 7,732 | 60% |
  | mlabonne/NeuralHermes-2.5-Mistral-7B (original recipe) | 43.67 | 73.24 | 55.37 | 41.76 | 53.51 | 12,859 | 100% |
- | teknium/OpenHermes-2.5-Mistral-7B | 42.75 | 72.99 | 52.99 | 40.94 | 52.42 | 0 (no DPO) | N/A |
-
 
  </div>


+ ## Introduction
+ This model is the launching partner of our new open dataset [argilla/distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs). It outperforms the awesome `mlabonne/NeuralHermes-2.5-Mistral-7B` using exactly the **same DPO recipe but 54% less data**.
+
+ The dataset is a "distilabeled" version of the widely used [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs). The original dataset has been used by hundreds of open-source practitioners and models. We knew from fixing UltraFeedback (and, before that, Alpaca and Dolly) that this dataset could be greatly improved.
+
+ Continuing our mission to build the best alignment datasets for open-source LLMs and the community, we spent a few hours improving it with [distilabel](https://github.com/argilla-io/distilabel).
+
+ The main intuition was this: the original dataset simply assumes that the gpt-4/3.5-turbo responses are always the best, and we know from UltraFeedback that this is not always the case. Moreover, DPO fine-tuning benefits from diversity in the preference pairs.
+
+ This is what it took to build a real preference dataset with distilabel:
+
+ ```python
+ from distilabel.llm import OpenAILLM
+ from distilabel.tasks import UltraFeedbackTask, JudgeLMTask
+ from distilabel.pipeline import Pipeline
+
+ from datasets import load_dataset
+
+ dataset = load_dataset("Intel/orca_dpo_pairs", split="train")
+
+ # this shuffles the chosen/rejected pairs to mitigate positional bias
+ dataset = dataset.map(lambda x: shuffle_and_track(x["chosen"], x["rejected"]))
+
+ # we use our JudgeLM implementation to rate the original pairs
+ labeler = OpenAILLM(
+     task=JudgeLMTask(),
+     model="gpt-4-1106-preview",
+     num_threads=16,
+     max_new_tokens=512,
+ )
+
+ dataset = dataset.rename_columns({"question": "input"})
+
+ distipipe = Pipeline(
+     labeller=labeler
+ )
+
+ # this computes ratings and natural language critiques for each pair
+ ds = distipipe.generate(dataset=dataset, num_generations=2)
+ ```
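+ One note on the snippet above: `shuffle_and_track` is a small helper that is not shown here. A minimal sketch of what it could look like (this exact implementation is an assumption, not necessarily the original helper) is:
+
+ ```python
+ import random
+
+ random.seed(42)
+
+ def shuffle_and_track(chosen, rejected):
+     # present the two responses in a random order and remember which was which,
+     # so the judge's positional bias cannot systematically favor "chosen"
+     pair = [chosen, rejected]
+     random.shuffle(pair)
+     order = ["chosen" if x == chosen else "rejected" for x in pair]
+     return {"generations": pair, "order": order}
+ ```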
+ The resulting dataset is now much more useful: we know which response is preferred (according to gpt-4-turbo), which ones have low scores, and we even have natural-language explanations. But what did we find? Was our intuition confirmed?
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/60420dccc15e823a685f2b03/-V8wY1DYzrtwM9LbGrBXq.png)
+
+ The above chart shows the following:
+
+ * ~4,000 pairs were given the same rating (a tie).
+ * ~7,000 pairs were correct according to our AI judge (`unchanged`).
+ * in ~2,000 cases the originally rejected response was preferred (`swapped`).
+
+ You can reproduce these counts directly from the dataset's `status` column, as sketched below.
+
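+ A quick way to check this yourself (a minimal sketch; it assumes the published dataset exposes the `tie`/`unchanged`/`swapped` labels in its `status` column and the judge's score in `chosen_score`, the same columns used in the filter further below):
+
+ ```python
+ from collections import Counter
+
+ from datasets import load_dataset
+
+ ds = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")
+
+ # how many pairs were tied, confirmed (unchanged), or swapped by the judge
+ print(Counter(ds["status"]))
+
+ # how many pairs have a highly rated chosen response
+ print(sum(1 for s in ds["chosen_score"] if s is not None and s >= 8))
+ ```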
+ Now the next question is: can we build better models with this new knowledge? The answer is "distilabeled OpenHermes", so let's get back to the model!
+
+ ## Training details
+
+ As mentioned above, we kept the same DPO recipe but trained on a smaller, higher-quality subset of the new dataset, selected with the filter below:
+
+ ```python
+ from datasets import load_dataset
+
+ dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")
+
+ # keep only pairs where the judge expressed a clear preference (no ties),
+ # the chosen response scored high, and the prompt is not in the GSM8K train split
+ dataset = dataset.filter(
+     lambda r:
+         r["status"] != "tie" and
+         r["chosen_score"] >= 8 and
+         not r["in_gsm8k_train"]
+ )
+ ```
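+ For completeness, here is a minimal sketch of the DPO step itself using TRL's `DPOTrainer`. It is not the exact training configuration: the hyperparameters, the LoRA settings, and the `prompt`/`chosen`/`rejected` column mapping below are assumptions, and prompt formatting with the model's chat template is omitted:
+
+ ```python
+ from datasets import load_dataset
+ from peft import LoraConfig
+ from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
+ from trl import DPOTrainer
+
+ model_name = "teknium/OpenHermes-2.5-Mistral-7B"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ tokenizer.pad_token = tokenizer.eos_token
+ model = AutoModelForCausalLM.from_pretrained(model_name)
+
+ # same filtered subset as above
+ dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")
+ dataset = dataset.filter(
+     lambda r: r["status"] != "tie" and r["chosen_score"] >= 8 and not r["in_gsm8k_train"]
+ )
+
+ # DPOTrainer expects prompt / chosen / rejected columns (column names assumed here)
+ dataset = dataset.map(
+     lambda r: {"prompt": r["input"], "chosen": r["chosen"], "rejected": r["rejected"]}
+ )
+
+ peft_config = LoraConfig(
+     r=16,
+     lora_alpha=16,
+     lora_dropout=0.05,
+     task_type="CAUSAL_LM",
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
+ )
+
+ training_args = TrainingArguments(
+     output_dir="distilabeled-hermes-dpo",
+     per_device_train_batch_size=2,
+     gradient_accumulation_steps=8,
+     learning_rate=5e-5,
+     max_steps=200,
+     logging_steps=10,
+ )
+
+ trainer = DPOTrainer(
+     model,
+     args=training_args,
+     beta=0.1,
+     train_dataset=dataset,
+     tokenizer=tokenizer,
+     peft_config=peft_config,
+     max_prompt_length=1024,
+     max_length=1536,
+ )
+ trainer.train()
+ ```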
+ ## Benchmark results
  | Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average | dpo-pairs | % original pairs |
  |-------------------------------------------------------------------------------------------------------------------|--------:|--------:|-----------:|---------:|--------:|----------:|-----------------:|
  | [argilla/distilabeled-Hermes-2.5-Mistral-7B](https://huggingface.co/argilla/distilabeled-Hermes-2.5-Mistral-7B) | **44.64** | **73.35** | 55.96 | 42.21 | **54.04** | 5,922 | **46%** |
  | [dvilasuero/NeuralHermes-2.5-Mistral-7B-distilabel](https://huggingface.co/dvilasuero/NeuralHermes-2.5-Mistral-7B-distilabel) (first experiment) | 44.27 | 73.3 | **56.26** | **42.25** | 54.02 | 7,732 | 60% |
  | mlabonne/NeuralHermes-2.5-Mistral-7B (original recipe) | 43.67 | 73.24 | 55.37 | 41.76 | 53.51 | 12,859 | 100% |
+ | teknium/OpenHermes-2.5-Mistral-7B | 42.75 | 72.99 | 52.99 | 40.94 | 52.42 | 0 (no DPO) | N/A |