Update README.md
README.md CHANGED
@@ -47,14 +47,12 @@ Data sampling weights:

 ## Performance

-[INSERT FIGURE: Performance comparison across models]
-
 Key improvements over Gemma-2B baseline:
 - HellaSwag-DE: +71% (47.9% vs 28.0%)
 - ARC-DE: +41% (32.3% vs 22.9%)
 - Average zero-shot: +40% (35.8% vs 25.5%)

-
+BübleLM-2B consistently outperforms both the base Gemma-2B and other German models like LLaMmlein-1B across most tasks.

 <table class="model-comparison">
 <thead>