Commit 82dc0ab by DavidGF (1 parent: b9c2d93)

Update README.md

  1. README.md +18 -0
README.md CHANGED
@@ -61,7 +61,25 @@ as chosen answers and [Sauerkraut-7b-HerO](https://huggingface.co/VAGOsolutions/
  We found that simply translating the training data can lead to unnatural German phrasing.
  Data augmentation techniques were used to ensure grammatical and syntactic correctness and more natural German wording in our training data.
 
+ ### Data Contamination Test Results
+
+ Some models on the HuggingFace leaderboard were affected by data contamination, i.e. benchmark test data leaking into their training data.
+ We checked our SauerkrautLM-DPO dataset for this problem with a dedicated contamination test [1], run on a smaller model.
+ The HuggingFace team used the same method [2, 3].
+
+ Our results, with the `result < 0.1, %:` values all well below 0.9, indicate that our dataset is free from contamination.
+
+ *The data contamination test results for HellaSwag and Winogrande will be added once [1] supports them.*
+
+ | Dataset | ARC | MMLU | TruthfulQA | GSM8K |
+ |------------------------------|-------|-------|-------|-------|
+ | **SauerkrautLM-DPO** | result < 0.1, %: 0.0 | result < 0.1, %: 0.09 | result < 0.1, %: 0.13 | result < 0.1, %: 0.16 |
+
+ [1] https://github.com/swj0419/detect-pretrain-code-contamination
+
+ [2] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/474#657f2245365456e362412a06
+
+ [3] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/265#657b6debf81f6b44b8966230
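
The reading of the table above can be sketched in a few lines of Python. This is an illustration only, not the code of the detection tool in [1]: the names `CONTAMINATION_CUTOFF`, `reported_scores`, and `flag_contaminated` are hypothetical, the scores are copied by hand from the table, and the 0.9 cutoff is the figure quoted in the text rather than a value read from the tool.

```python
# Illustrative sketch: compare the reported `result < 0.1, %:` values
# against the 0.9 cutoff mentioned in the text. All names are hypothetical.
CONTAMINATION_CUTOFF = 0.9

# Values copied by hand from the SauerkrautLM-DPO row of the table.
reported_scores = {
    "ARC": 0.0,
    "MMLU": 0.09,
    "TruthfulQA": 0.13,
    "GSM8K": 0.16,
}


def flag_contaminated(scores, cutoff=CONTAMINATION_CUTOFF):
    """Return the benchmarks whose score reaches or exceeds the cutoff."""
    return [name for name, score in scores.items() if score >= cutoff]


flagged = flag_contaminated(reported_scores)
print(flagged if flagged else "no benchmark reaches the cutoff")
```

With the four reported values, no benchmark is flagged, which matches the conclusion stated in the section.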
 
  ### Prompt Template:
  ```