BramVanroy committed
Commit 01761f1 • 1 Parent(s): e3dd55d
Update README.md
README.md
CHANGED
@@ -32,10 +32,6 @@ This model is a fine-tuned version of [BramVanroy/GEITje-7B-ultra-sft](https://h
 > [!TIP]
 > 🚀 Looking for the fast GGUF version? You can find it, and how to use it with `ollama`, [here](https://huggingface.co/BramVanroy/GEITje-7B-ultra-GGUF). 🚀
 
-## Model description
-
-This is a Dutch instruction/chat model ultimately based on Mistral and aligned with AI feedback via DPO. It is a DPO continuation of the SFT trained [BramVanroy/GEITje-7B-ultra-sft](https://huggingface.co/BramVanroy/GEITje-7B-ultra-sft), which in turn is based on [Rijgersberg/GEITje-7B](https://huggingface.co/Rijgersberg/GEITje-7B), which in turn is based on Mistral 7B and further pretrained on Dutch data. In (rather naive) [benchmarks](https://huggingface.co/spaces/BramVanroy/open_dutch_llm_leaderboard) it outperforms all the original GEITje models on average (but barely). However, note that these benchmarks should be taken with a massive grain of salt (see the disclaimer below the benchmarks on that page). The best evaluation is to try the models and see for yourself.
-
 ## Citation
 
 If you use GEITje 7B Ultra (SFT) or any of its derivatives or quantizations, place cite the following paper:
@@ -52,6 +48,11 @@ If you use GEITje 7B Ultra (SFT) or any of its derivatives or quantizations, pla
 }
 ```
 
+## Model description
+
+This is a Dutch instruction/chat model ultimately based on Mistral and aligned with AI feedback via DPO. It is a DPO continuation of the SFT trained [BramVanroy/GEITje-7B-ultra-sft](https://huggingface.co/BramVanroy/GEITje-7B-ultra-sft), which in turn is based on [Rijgersberg/GEITje-7B](https://huggingface.co/Rijgersberg/GEITje-7B), which in turn is based on Mistral 7B and further pretrained on Dutch data. In (rather naive) [benchmarks](https://huggingface.co/spaces/BramVanroy/open_dutch_llm_leaderboard) it outperforms all the original GEITje models on average (but barely). However, note that these benchmarks should be taken with a massive grain of salt (see the disclaimer below the benchmarks on that page). The best evaluation is to try the models and see for yourself.
+
+
 ## Usage
 
 One-off:
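The hunk cuts off at the `One-off:` label, before the README's actual usage snippet. As a rough illustration only, here is a minimal sketch of what a one-off chat call with the 🤗 `transformers` text-generation pipeline might look like; the model ID `BramVanroy/GEITje-7B-ultra` and all generation settings are assumptions, not taken from this diff:

```python
from transformers import pipeline
import torch

# Minimal sketch, not the snippet from the README itself.
# Assumes the DPO model is published as "BramVanroy/GEITje-7B-ultra" and that a
# recent transformers release with chat-template support is installed.
pipe = pipeline(
    "text-generation",
    model="BramVanroy/GEITje-7B-ultra",  # assumed model ID
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires accelerate; drop to load on a single device
)

# One-off question in Dutch; the pipeline applies the model's chat template.
messages = [{"role": "user", "content": "Wat is de hoofdstad van Nederland?"}]
out = pipe(messages, max_new_tokens=128, do_sample=True, temperature=0.7)

# With chat-style input, generated_text holds the full message list;
# the last entry is the assistant reply.
print(out[0]["generated_text"][-1]["content"])
```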