DavidGF committed
Commit 330eb18
1 Parent(s): 778ff71

Update README.md

Files changed (1): README.md (+20 -1)
README.md CHANGED
@@ -30,6 +30,7 @@ Aligned with **DPO**
 2. [Model Details](#model-details)
 - [Prompt template](#prompt-template)
 - [Training Dataset](#training-dataset)
+ - [Data Contamination Test](#data-contamination-test-results)
 3. [Evaluation](#evaluation)
 5. [Disclaimer](#disclaimer)
 6. [Contact](#contact)
@@ -55,11 +56,29 @@ Aligned with **DPO**
 
 SauerkrautLM-Mixtral-8x7B-Instruct was trained with a mix of German data augmentation and translated data.
 Aligned through **DPO** with our **new German SauerkrautLM-DPO dataset**, based on parts of the SFT SauerkrautLM dataset
- as chosen answers and [Sauerkraut-7b-HerO](https://huggingface.co/VAGOsolutions/SauerkrautLM-7b-HerO) as rejected answers, augmented with additional **translated parts of [HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)** and **[argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo)**.
+ as chosen answers and [Sauerkraut-7b-HerO](https://huggingface.co/VAGOsolutions/SauerkrautLM-7b-HerO) as rejected answers, augmented with additional **translated parts of [HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)** (our dataset does not contain any TruthfulQA prompts; see the Data Contamination Test Results below) and **[argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo)**.
 We found that a simple translation of training data can lead to unnatural German phrasing.
 Data augmentation techniques were used to ensure grammatical and syntactical correctness and more natural German wording in our training data.
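The preference-pair construction described above can be sketched as follows; this is a minimal illustration, assuming hypothetical field names and generation settings, not the authors' actual pipeline:

```python
# Minimal sketch: pair a curated German SFT answer (chosen) with a
# Sauerkraut-7b-HerO completion (rejected) to form one DPO record.
# Field names and generation settings are illustrative assumptions.
from transformers import pipeline

# Sauerkraut-7b-HerO supplies the "rejected" side of each pair.
rejected_generator = pipeline(
    "text-generation",
    model="VAGOsolutions/SauerkrautLM-7b-HerO",
)

def build_dpo_record(prompt: str, sft_answer: str) -> dict:
    """Return one preference record in the usual DPO format."""
    completion = rejected_generator(prompt, max_new_tokens=256)[0]["generated_text"]
    return {
        "prompt": prompt,
        "chosen": sft_answer,    # curated answer from the SFT SauerkrautLM data
        "rejected": completion,  # weaker model's output
    }
```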
 
 
 
+ ### Data Contamination Test Results
+
+ Some models on the HuggingFace leaderboard have had benchmark data leak into their training data.
+ We checked our SauerkrautLM-DPO dataset for this problem with a dedicated detection test [1], run on a smaller model.
+ The HuggingFace team has used the same method [2, 3].
+
+ Our results, with `result < 0.1, %:` values well below 0.9, indicate that our dataset is free from contamination.
+
+ *The data contamination test results for HellaSwag and Winogrande will be added once [1] supports them.*
+
+ | Dataset | ARC | MMLU | TruthfulQA | GSM8K |
+ |------------------------------|-------|-------|-------|-------|
+ | **SauerkrautLM-DPO** | result < 0.1, %: 0.0 | result < 0.1, %: 0.09 | result < 0.1, %: 0.13 | result < 0.1, %: 0.16 |
+
+ [1] https://github.com/swj0419/detect-pretrain-code-contamination
+
+ [2] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/474#657f2245365456e362412a06
+
+ [3] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/265#657b6debf81f6b44b8966230
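The test in [1] appears to build on the Min-K% Prob membership score. Below is a minimal sketch of that score, assuming a small stand-in model; the actual tool additionally normalizes against a reference model and reports, per benchmark, the share of samples scoring below 0.1:

```python
# Sketch of the Min-K% Prob membership score underlying the test in [1].
# Simplified illustration only: the real tool compares a target model
# against a reference model; values near 0.9 suggest contamination.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def min_k_prob(text: str, k: float = 0.2) -> float:
    """Average log-prob of the k% least likely tokens in `text`."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # log-prob assigned to each actual next token
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = logprobs.gather(1, ids[0, 1:].unsqueeze(-1)).squeeze(-1)
    n = max(1, int(len(token_lp) * k))
    return token_lp.sort().values[:n].mean().item()

# Benchmark samples scoring unusually high relative to a reference
# model would hint that the model saw them during training.
print(min_k_prob("The quick brown fox jumps over the lazy dog."))
```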
 
 ### Prompt Template:
 ```