Update README.md
README.md
CHANGED
@@ -52,6 +52,8 @@ I've been asked what this is. For each layer, I use mergekit io to extract each
* Normalized: I divide each layer by its norm before the transform, then scale back up after the inverse transform by multiplying the result by a midpoint of the tensors' norms. Scaling commutes with the transform, so it's more efficient to do it before the values become complex.
* Denoised Fourier Interpolation: I first apply a 2D Fourier transform to each tensor, then merge the tensors using SLERP or addition, then zero out the weights below a threshold percentage (a somewhat high 2%, but the result remains coherent on all the positions I tested, if a bit drier and sloppier as you go up).
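
As a rough illustration of the two steps above, here is a minimal NumPy sketch. It makes several assumptions that the README does not confirm: it works on plain 2D arrays rather than mergekit tensors, uses an equal-weight average as the "addition" merge (SLERP is left out), reads the 2% threshold as a fraction of the largest coefficient magnitude, and takes the "midpoint" of the norms to be their arithmetic mean. The name `denoised_fourier_merge` and its parameters are hypothetical.

```python
import numpy as np


def denoised_fourier_merge(a, b, threshold=0.02):
    """Merge two same-shaped 2D arrays in the frequency domain (illustrative only)."""
    # Normalize each tensor by its Frobenius norm before the transform.
    norm_a = np.linalg.norm(a)
    norm_b = np.linalg.norm(b)
    fa = np.fft.fft2(a / norm_a)
    fb = np.fft.fft2(b / norm_b)

    # Merge by addition (assumed here: an equal-weight average).
    merged = 0.5 * (fa + fb)

    # Denoise: zero every coefficient whose magnitude falls below
    # `threshold` times the largest magnitude (assumed interpretation of "2%").
    magnitude = np.abs(merged)
    merged = np.where(magnitude >= threshold * magnitude.max(), merged, 0)

    # Invert, then scale back up by the midpoint (assumed: mean) of the norms.
    restored = np.fft.ifft2(merged)
    return restored * 0.5 * (norm_a + norm_b)
```

The function returns a complex array. Because the FFT is linear, dividing by the norm before the transform gives the same result as dividing afterwards, but it only touches real values, which is the efficiency point made in the first bullet.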

+Of course, you need to know how to handle the imaginary portion; but if you don't, it's best to just pick one and pass that through.
+
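
One way to read "pick one and pass that through" is to keep only the real component of the inverse transform. The sketch below assumes that reading; the helper `to_real` and its `tol` parameter are hypothetical. It also reports whether the discarded imaginary part was negligible, so you can tell when dropping it actually loses information.

```python
import numpy as np


def to_real(z, tol=1e-6):
    """Keep the real component of z and flag whether the imaginary part was negligible."""
    z = np.asarray(z)
    # Compare the largest imaginary magnitude against the overall scale of z.
    scale = float(np.abs(z).max())
    negligible = bool(np.abs(z.imag).max() <= tol * scale)
    return z.real, negligible


# A plain round trip of real data leaves only rounding noise in the imaginary part.
x = np.arange(16.0).reshape(4, 4)
roundtrip = np.fft.ifft2(np.fft.fft2(x))
real_part, was_negligible = to_real(roundtrip)
```

If the flag comes back `False`, the merge step has broken the symmetry that real-valued inputs give their spectra, and the imaginary portion needs more deliberate handling than simply discarding it.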
### Format
Use Llama3 Instruct format.