Update README.md
README.md
CHANGED
@@ -52,7 +52,7 @@ Optimized with **LaserRMT**
- **License:** Apache 2.0
- **Contact:** [Website VAGO solutions](https://vago-solutions.de/#Kontakt), [Website Hyperspace.ai](https://hyperspace.ai/)

-###
+### Training procedure:
Anyone who has attempted or succeeded in fine-tuning a model is aware of the difficulty in nudging it towards a specific skill, such as mastering new languages, as well as the challenges associated with achieving significant improvements in performance. Experimenting with a novel training strategy and Spherical Linear Interpolation alongside a lasered version of the model itself has proven to be both fascinating and revealing.
Furthermore, we developed one iteration of the model using our entire SFT -Sauerkraut dataset and two additional iterations using subsets of the full dataset—one focused on enhancing MMLU and TQA capabilities, and the other on boosting GSM8K and Winogrande skills.