perlthoughts committed
Commit 34bc68a
1 Parent(s): 9513477

Update README.md

Files changed (1):
  1. README.md +22 -22
README.md CHANGED
@@ -6,6 +6,28 @@ license: apache-2.0
 
 <p><img src="https://huggingface.co/perlthoughts/Chupacabra-7B/resolve/main/chupacabra.jpeg" width=320></p>
 
+ ### Model Description
+
+ The merged models are all based on Mistral.
+
+ All model weights were merged using the SLERP method. More information below.
+
+ Advantages of the SLERP method over weight averaging are as follows:
+
+ Spherical Linear Interpolation (SLERP)
+ Traditionally, model merging often resorts to weight averaging, which, although straightforward, might not capture the intricate features of the models being merged. SLERP addresses this limitation, producing a blended model whose characteristics are smoothly interpolated from both parent models.
+
+ Smooth Transitions
+ SLERP ensures smoother transitions between model parameters, which is especially significant when interpolating between high-dimensional vectors.
+
+ Better Preservation of Characteristics
+ Unlike weight averaging, which can dilute distinct features, SLERP preserves the curvature and characteristics of both models in high-dimensional space.
+
+ Nuanced Blending
+ SLERP takes the geometric and rotational properties of the models in the vector space into account, resulting in a blend that better reflects both parent models' characteristics.
+
+ A list of all merged models and the merging path is coming soon.
+
 ## Purpose
 
 Merging the "thick"est model weights from Mistral models using training methods such as direct preference optimization (DPO) and reinforcement learning.
@@ -28,28 +50,6 @@ Here is my contribution.
 GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:
 ```
 
- ### Model Description
-
- The merged models are all based on Mistral.
-
- All model weights were merged using the SLERP method. More information below.
-
- Advantages of the SLERP method over weight averaging are as follows:
-
- Spherical Linear Interpolation (SLERP)
- Traditionally, model merging often resorts to weight averaging, which, although straightforward, might not capture the intricate features of the models being merged. SLERP addresses this limitation, producing a blended model whose characteristics are smoothly interpolated from both parent models.
-
- Smooth Transitions
- SLERP ensures smoother transitions between model parameters, which is especially significant when interpolating between high-dimensional vectors.
-
- Better Preservation of Characteristics
- Unlike weight averaging, which can dilute distinct features, SLERP preserves the curvature and characteristics of both models in high-dimensional space.
-
- Nuanced Blending
- SLERP takes the geometric and rotational properties of the models in the vector space into account, resulting in a blend that better reflects both parent models' characteristics.
-
- A list of merged models is coming soon, as well as more information on merging techniques and methods.
-
 ### Bug fixes
 
 - Fixed an issue with generation caused by incorrect model weights. The model weights have been corrected and generation works again. Re-uploading the GGUF files to the GGUF repository, as well as the AWQ versions.
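
For readers who want to see what the SLERP merge described in the diff looks like in practice, below is a minimal PyTorch sketch. It assumes two Mistral-architecture state dicts with identical keys and shapes; the helper names (`slerp`, `merge_state_dicts`) and the interpolation factor `t = 0.5` are illustrative assumptions, not the exact configuration used for this model.

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Falls back to plain linear interpolation when the tensors are
    (nearly) colinear, where SLERP is numerically unstable.
    """
    v0_f = v0.flatten().float()
    v1_f = v1.flatten().float()

    # Angle between the two weight vectors, computed on unit-norm copies.
    v0_u = v0_f / (v0_f.norm() + eps)
    v1_u = v1_f / (v1_f.norm() + eps)
    dot = torch.clamp(torch.dot(v0_u, v1_u), -1.0, 1.0)

    if 1.0 - dot.abs() < eps:
        # Nearly parallel vectors: plain linear interpolation.
        merged = (1.0 - t) * v0_f + t * v1_f
    else:
        theta = torch.arccos(dot)
        sin_theta = torch.sin(theta)
        merged = (torch.sin((1.0 - t) * theta) / sin_theta) * v0_f \
               + (torch.sin(t * theta) / sin_theta) * v1_f

    return merged.reshape(v0.shape).to(v0.dtype)

def merge_state_dicts(sd_a: dict, sd_b: dict, t: float = 0.5) -> dict:
    """SLERP-merge two state dicts with identical keys and shapes (hypothetical helper)."""
    return {name: slerp(t, sd_a[name], sd_b[name]) for name in sd_a}

# Example usage (hypothetical models):
# merged_sd = merge_state_dicts(model_a.state_dict(), model_b.state_dict(), t=0.5)
# model_a.load_state_dict(merged_sd)
```

Per-tensor SLERP like this preserves the direction of each parent's weight vector rather than averaging magnitudes, which is the property the README's "Better Preservation of Characteristics" paragraph refers to.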
 
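
As a usage note on the prompt format quoted in the diff (the `GPT4 User:` / `GPT4 Assistant:` turn format used by OpenChat-style models), here is a small sketch of filling that template in Python. The function and variable names are hypothetical and not part of the repository.

```python
# Build a single-turn prompt using the template shown in the README diff.
PROMPT_TEMPLATE = "GPT4 User: {prompt}<|end_of_turn|>GPT4 Assistant:"

def build_prompt(user_message: str) -> str:
    """Fill the single-turn template with the user's message."""
    return PROMPT_TEMPLATE.format(prompt=user_message)

if __name__ == "__main__":
    print(build_prompt("Tell me about the chupacabra."))
    # GPT4 User: Tell me about the chupacabra.<|end_of_turn|>GPT4 Assistant:
```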