DavidGF committed · verified
Commit a6aa46f · 1 Parent(s): 33a53b1

Create README.md

Files changed (1)
  1. README.md +131 -0
README.md ADDED
@@ -0,0 +1,131 @@
---
license: apache-2.0
language:
- de
- en
tags:
- spectrum
- sft
- dpo
datasets:
- VAGOsolutions/SauerkrautLM-Fermented-GER-DPO
- VAGOsolutions/SauerkrautLM-Fermented-Irrelevance-GER-DPO
---


![SauerkrautLM-v2-14b-DPO](https://vago-solutions.ai/wp-content/uploads/2024/08/SauerkrautLM-v2-14b-DPO.png "SauerkrautLM-v2-14b-DPO")
## VAGO solutions SauerkrautLM-v2-14b-DPO

**Fine-tuned Model** - *Enhanced DPO-tuned version with a focus on English performance and German function calling irrelevance optimization*
20
+
21
+ Introducing **SauerkrautLM-v2-14b-DPO** – our advanced DPO-tuned version based on [SauerkrautLM-v2-14b-SFT](https://huggingface.co/VAGOsolutions/SauerkrautLM-v2-14b-SFT)!
22
+
23
+ - Three-phase training approach combining SFT and DPO
24
+ - Enhanced English language performance while maintaining German capabilities
25
+ - Optimized function calling with improved irrelevance handling
26
+ - Released with two new community datasets for custom training
27
+
28
+ # Table of Contents
29
+ 1. [Overview of all SauerkrautLM-v2-14b Models](#all-SauerkrautLM-v2-14b)
30
+ 2. [Model Details](#model-details)
31
+ - [Training procedure](#training-procedure)
32
+ 3. [Released Datasets](#released-datasets)
33
+ 4. [Evaluation](#evaluation)
34
+ 5. [Disclaimer](#disclaimer)
35
+ 6. [Contact](#contact)
36
+ 7. [Collaborations](#collaborations)
37
+ 8. [Acknowledgement](#acknowledgement)
38
+
39
+ ## All SauerkrautLM-v2-14b
40
+
41
+ | Model | HF | EXL2 | GGUF | AWQ |
42
+ |-------|-------|-------|-------|-------|
43
+ | SauerkrautLM-14b-v2-SFT | [Link](https://huggingface.co/VAGOsolutions/SauerkrautLM-v2-14b-SFT) | coming soon | coming soon | coming soon |
44
+ | SauerkrautLM-14b-v2-DPO | [Link](https://huggingface.co/VAGOsolutions/SauerkrautLM-v2-14b-DPO) | coming soon | coming soon | coming soon |
45
+
46
+ ## Model Details
47
+ **SauerkrautLM-v2-14b-DPO**
48
+ - **Base Model:** [SauerkrautLM-v2-14b-SFT](https://huggingface.co/VAGOsolutions/SauerkrautLM-v2-14b-SFT)
49
+ - **Language(s):** English (primary), German
50
+ - **License:** Apache 2.0
51
+ - **Contact:** [VAGO solutions](https://vago-solutions.ai)
52
+
53
+ ## Training Procedure
54
+
55
+ This model extends our two-phase SFT model with an additional DPO phase, creating a comprehensive three-phase training approach:
56
+
57
+ **Phase 1 & 2 (SFT)**:
58
+ - Identical to SauerkrautLM-v2-14b-SFT training
59
+ - Phase 1: 25% layer targeting with 0.6B tokens
60
+ - Phase 2: 20% layer targeting with 0.6B tokens
61
+
62
+ **Phase 3 (DPO)**:
63
+ - Spectrum Fine-Tuning targeting 15% of layers
64
+ - Training on 80M tokens
65
+ - Focus on English performance optimization
66
+ - Integration of German performance preservation
67
+ - Enhanced german function calling irrelevance handling
68
+
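To illustrate how sparse the per-phase updates are, the sketch below computes how many decoder blocks each phase leaves trainable. The block count of 48 is an assumption (typical for 14B-class Qwen2-family models), and real Spectrum additionally decides *which* blocks to unfreeze via signal-to-noise analysis, which is not shown here:

```python
import math

def trainable_layer_count(num_layers: int, fraction: float) -> int:
    # How many transformer blocks a Spectrum phase leaves trainable.
    # Real Spectrum also chooses *which* blocks, based on a
    # signal-to-noise ratio analysis of the weight matrices.
    return math.ceil(num_layers * fraction)

# Assumption: 48 decoder blocks, as in typical 14B Qwen2-family models.
for phase, fraction in [("SFT phase 1", 0.25), ("SFT phase 2", 0.20), ("DPO phase 3", 0.15)]:
    print(f"{phase}: {trainable_layer_count(48, fraction)} of 48 blocks trainable")
```

Touching only 8–12 of 48 blocks per phase keeps compute and memory costs well below full fine-tuning while still shifting model behavior.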
**Dataset Composition for DPO**:
- Extended previous DPO dataset
- New SauerkrautLM-Fermented-GER-DPO dataset
- SauerkrautLM-Fermented-Irrelevance-GER-DPO dataset
- Carefully balanced to maintain German language capabilities

## Released Datasets

As part of this release, we're making two new datasets available to the community:

**SauerkrautLM-Fermented-GER-DPO**:
- 3,300 high-quality German training samples
- Multiple judgment criteria for flexible filtering
- Enables customized training approaches
- Comprehensive metadata for sample selection

**SauerkrautLM-Fermented-Irrelevance-GER-DPO**:
- 2,000 specialized German training samples
- Focus on function calling irrelevance optimization
- Multiple filtering criteria included
- Designed for community experimentation

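For orientation: a DPO preference pair couples a prompt with a preferred and a rejected response, and the judgment metadata mentioned above can drive custom filtering. The field names below are purely illustrative, not necessarily the datasets' actual schema:

```python
# Illustrative only: field names are hypothetical, not the datasets' real schema.
samples = [
    {"prompt": "Wie lege ich ein Python-Dict an?", "chosen": "...", "rejected": "...", "judgment_score": 9},
    {"prompt": "Erzähl mir einen Witz.", "chosen": "...", "rejected": "...", "judgment_score": 4},
]

# Keep only pairs whose judge rated the preference strongly,
# e.g. for a stricter custom DPO run.
high_quality = [s for s in samples if s["judgment_score"] >= 8]
print(len(high_quality))  # 1
```

Filtering on such per-sample scores is one way to trade dataset size against preference-pair quality in a custom training run.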
## Objective and Results

This DPO-enhanced version aims to:
- Optimize English language performance
- Maintain German language capabilities
- Improve German function calling irrelevance handling
- Provide valuable training resources to the community

## Evaluation

**AGIEVAL**
![SauerkrautLM-v2-14b-DPO-AGIEVAL](https://vago-solutions.ai/wp-content/uploads/2024/08/AGIeval-14b-dpo.png "SauerkrautLM-v2-14b-DPO-AGIEVAL")

**GPT4ALL**
![SauerkrautLM-v2-14b-DPO-GPT4ALL](https://vago-solutions.ai/wp-content/uploads/2024/08/GPT4ALL-14b-dpo.png "SauerkrautLM-v2-14b-DPO-GPT4ALL")

**TRUTHFULQA**
![SauerkrautLM-v2-14b-DPO-TRUTHFULQA](https://vago-solutions.ai/wp-content/uploads/2024/08/TQA-14b-dpo.png "SauerkrautLM-v2-14b-DPO-TRUTHFULQA")

**OPENLEADERBOARD 2**
![SauerkrautLM-v2-14b-DPO-OPENLEADERBOARD](https://vago-solutions.ai/wp-content/uploads/2024/08/HF2-14b-dpo.png "SauerkrautLM-v2-14b-DPO-OPENLEADERBOARD")

**MMLU 5-shot**
![SauerkrautLM-v2-14b-DPO-MMLU-5shot](https://vago-solutions.ai/wp-content/uploads/2024/08/MMLU-14b-dpo.png "SauerkrautLM-v2-14b-DPO-MMLU-5shot")

**Berkeley Function Calling Leaderboard**
![SauerkrautLM-v2-14b-DPO-BERKELEY](https://vago-solutions.ai/wp-content/uploads/2024/08/Berkeley-14b-dpo.png "SauerkrautLM-v2-14b-DPO-BERKELEY")

Please note that our benchmark results in absolute numbers may differ from the Hugging Face Leaderboard due to variations in benchmark evaluation pipelines. However, the relative differences remain consistent.

## Disclaimer
Despite our best efforts in data cleansing, the possibility of uncensored content slipping through cannot be entirely ruled out, and we cannot guarantee consistently appropriate behavior. If you encounter any issues or come across inappropriate content, we kindly request that you inform us through the contact information provided. Please also note that the licensing of these models does not constitute legal advice, and we are not responsible for the actions of third parties who use our models.

## Contact
If you are interested in customized LLMs for business applications, please get in contact with us via our website. We are also grateful for your feedback and suggestions.

## Collaborations
We are also keenly seeking support and investment for our startup, VAGO solutions, where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us at [VAGO solutions](https://vago-solutions.ai).

## Acknowledgement
Many thanks to [Qwen](https://huggingface.co/Qwen) for providing such a valuable base model, and to our community for their continued support and engagement.