maldv committed on
Commit
bcac3ba
1 Parent(s): 9b62d38

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -52,6 +52,8 @@ I've been asked what this is. For each layer, I use mergekit io to extract each
  * Normalized: I divide each layer by its norm before the transform, then scale the result back up after the inverse transform by multiplying it by a midpoint of the input tensors' norms. Scaling commutes with the transform, so it's more efficient to do it before the values become complex.
  * Denoised Fourier Interpolation: I first apply a 2D Fourier transform to each tensor; then merge the tensors using SLERP or addition; then zero out the weights below a threshold percentage (a somewhat high 2%, but the output remains coherent at all the positions I tested, if a bit drier and sloppier as you go up).

+ Of course, you need to know how to handle the imaginary portion; if you don't, it's best to just pick one component and pass that through.
+
  ### Format

  Use Llama3 Instruct format.
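
For readers trying to picture the steps described in the hunk above, here is a minimal NumPy sketch. It is not the author's code. Several things are assumptions made for illustration: the function name `fourier_merge` and its parameters are hypothetical, plain linear interpolation stands in for the SLERP-or-addition merge step, "threshold percentage" is read as a fraction of the peak coefficient magnitude, the "midpoint" of the norms is read as the interpolated norm, and "pick one" is read as keeping the real component after the inverse transform.

```python
import numpy as np


def fourier_merge(a, b, t=0.5, threshold=0.02):
    """Blend two 2-D weight matrices in the frequency domain (illustrative sketch)."""
    # Normalize while the tensors are still real, before anything becomes complex.
    norm_a, norm_b = np.linalg.norm(a), np.linalg.norm(b)
    fa = np.fft.fft2(a / norm_a)
    fb = np.fft.fft2(b / norm_b)

    # Merge step. ASSUMPTION: linear interpolation, standing in for SLERP or addition.
    merged = (1.0 - t) * fa + t * fb

    # Denoise. ASSUMPTION: zero coefficients below `threshold` times the peak magnitude.
    magnitude = np.abs(merged)
    merged[magnitude < threshold * magnitude.max()] = 0.0

    # Inverse transform. ASSUMPTION: keep only the real component.
    out = np.fft.ifft2(merged).real

    # Scale back up. ASSUMPTION: the "midpoint" is the norm interpolated at t.
    return out * ((1.0 - t) * norm_a + t * norm_b)


# Toy usage on random matrices; real layers would be loaded from the models.
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))
b = rng.standard_normal((8, 8))
merged = fourier_merge(a, b)
print(merged.shape, merged.dtype)
```

With real inputs and a real-valued blend like this one, the imaginary part after the inverse transform should be only rounding noise, so taking `.real` discards essentially nothing. That may not hold for other merge rules, which is where the choice of component starts to matter.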