Update README.md
README.md
@@ -61,7 +61,25 @@ as chosen answers and [Sauerkraut-7b-HerO](https://huggingface.co/VAGOsolutions/
We found that simply translating the training data can lead to unnatural German phrasings.
Data augmentation techniques were used to ensure grammatical and syntactic correctness and a more natural German wording in our training data.

### Data Contamination Test Results
Some models on the HuggingFace leaderboard were affected by benchmark data leaking into their training data.
We checked our SauerkrautLM-DPO dataset for this problem with a dedicated detection tool [1], run on a smaller model.
The HuggingFace team used the same method [2, 3].

Our results, with `result < 0.1, %:` well below 0.9, indicate that our dataset is free from contamination.

*The data contamination test results for HellaSwag and Winogrande will be added once [1] supports them.*

| Dataset | ARC | MMLU | TruthfulQA | GSM8K |
|------------------------------|-------|-------|-------|-------|
| **SauerkrautLM-DPO** | result < 0.1, %: 0.0 | result < 0.1, %: 0.09 | result < 0.1, %: 0.13 | result < 0.1, %: 0.16 |

[1] https://github.com/swj0419/detect-pretrain-code-contamination

[2] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/474#657f2245365456e362412a06

[3] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/265#657b6debf81f6b44b8966230
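
The decision rule described above can be sketched in a few lines. This is only an illustration, not the actual pipeline of [1]: the function names are hypothetical, the per-example scores are made up, and the thresholds (0.1 per example, 0.9 overall) are taken from the text's interpretation of the `result < 0.1, %` statistic.

```python
def fraction_below(scores, score_threshold=0.1):
    """Fraction of per-example detection scores below `score_threshold`.

    This mirrors the `result < 0.1, %` statistic in the table above:
    `scores` would come from a membership-inference style test such as [1],
    where lower scores suggest an example was not seen during training.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    return sum(s < score_threshold for s in scores) / len(scores)


def likely_contaminated(scores, flag_threshold=0.9):
    # Rule of thumb from the text: only a fraction at or above ~0.9
    # would suggest the benchmark leaked into the training data.
    return fraction_below(scores) >= flag_threshold


# Hypothetical scores for four benchmark examples:
print(fraction_below([0.05, 0.2, 0.5, 0.05]))  # 0.5
print(likely_contaminated([0.2, 0.3, 0.4]))    # False
```

Under this reading, the reported values (0.0 to 0.16) sit far from the 0.9 flag threshold, which is what the section summarizes as "free from contamination".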
### Prompt Template:
```