Update README.md
README.md
CHANGED
@@ -7,9 +7,11 @@ tags:
 - merge
 
 ---
-#
+# Llama-3-11.5B-Instruct
+The core idea came from @jukofyork; see this [issue](https://github.com/arcee-ai/mergekit/issues/198).
 
-
+As I understand it, the idea is to make the model "think twice" while still leaping the same distance as the original. But why 0.7071067812?
+> The scale factor to use, eg: solve x^2 = 1/2 --> x = 1/sqrt(2) ≈ 0.7071067812
 
 ## Merge Details
 ### Merge Method
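
The quoted note picks the scale factor by solving x^2 = 1/2. A minimal sketch that checks this arithmetic follows; `scale_factor` and `combined` are illustrative names, not taken from the README or from mergekit. Reading the square as "the factor takes effect twice, multiplicatively" is an assumption here, not something the note confirms.

```python
import math

# Positive root of x^2 = 1/2, as in the quoted note.
scale_factor = 1 / math.sqrt(2)

# Assumption for illustration: if the factor is applied twice
# multiplicatively, the combined effect is one half.
combined = scale_factor * scale_factor

print(f"{scale_factor:.10f}")       # 0.7071067812
print(math.isclose(combined, 0.5))  # True
```

The comparison uses `math.isclose` rather than `==` because the squared value is only equal to 0.5 up to floating-point rounding.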