maldv committed on
Commit
0109bb9
1 Parent(s): 54dfbb0

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -50,7 +50,7 @@ I've been asked what this is. For each layer, I use mergekit io to extract each
 
  * Recursive Pairwise Disjoint: Using this information I build a stack of layer deltas. I'm a little compute limited, so I treat them in pairs. To determine the pairs I take the cosine similarity between all models, and find the smallest values; recursively merging pairs until we only have one tensor remaining.
  * Normalized: I take and divide each layer by it's norm, and then scale back up by multiplying the result by a midpoint from the norms of the tensors.
- * Denoised Fourier Interpolation: I first treat the tensor to a 2d fourier transform; then merge the tensors using SLERP or addition; then zero out the weights below a threshold percentage (a somewhat high 2%, but remains coherent all the way to 100%).
+ * Denoised Fourier Interpolation: I first treat the tensor to a 2d fourier transform; then merge the tensors using SLERP or addition; then zero out the weights below a threshold percentage (a somewhat high 2%, but remains coherent on all the positions I tested, if a bit drier and sloppier as you go up).
 
  ### Format
 
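
For readers trying to picture the three steps described in the README hunk above, here is a minimal NumPy sketch. It is not the author's code, and several details are assumptions: the function names (`cosine_similarity`, `fourier_merge`, `merge_pair`, `merge_all`) are hypothetical, the "midpoint" of the norms is taken as the arithmetic mean, the threshold is read as a fraction of the largest Fourier magnitude, only the addition variant of the merge is shown (SLERP is omitted), pairs are merged greedily one at a time, and the order in which normalization and the Fourier merge are applied is a guess.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def cosine_similarity(a, b):
    """Cosine similarity between two tensors, compared as flat vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + EPS))


def fourier_merge(a, b, threshold=0.02):
    """Merge two 2-D tensors in the frequency domain, then denoise.

    Assumption: coefficients whose magnitude is below `threshold` times
    the largest magnitude are zeroed. Addition is used for the merge.
    """
    merged = np.fft.fft2(a) + np.fft.fft2(b)
    magnitude = np.abs(merged)
    merged[magnitude < threshold * magnitude.max()] = 0
    return np.fft.ifft2(merged).real


def merge_pair(a, b, threshold=0.02):
    """Normalize both tensors, merge them, and rescale to a midpoint norm."""
    norm_a, norm_b = np.linalg.norm(a), np.linalg.norm(b)
    midpoint = 0.5 * (norm_a + norm_b)  # assumption: arithmetic mean
    merged = fourier_merge(a / (norm_a + EPS), b / (norm_b + EPS), threshold)
    return merged / (np.linalg.norm(merged) + EPS) * midpoint


def merge_all(deltas, threshold=0.02):
    """Repeatedly merge the least similar pair until one tensor remains."""
    tensors = list(deltas)
    while len(tensors) > 1:
        best = None
        for i in range(len(tensors)):
            for j in range(i + 1, len(tensors)):
                score = cosine_similarity(tensors[i], tensors[j])
                if best is None or score < best[0]:
                    best = (score, i, j)
        _, i, j = best
        merged = merge_pair(tensors[i], tensors[j], threshold)
        tensors = [t for k, t in enumerate(tensors) if k not in (i, j)]
        tensors.append(merged)
    return tensors[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_deltas = [rng.standard_normal((8, 8)) for _ in range(4)]
    result = merge_all(layer_deltas, threshold=0.02)
    print(result.shape, np.linalg.norm(result))
```

Raising `threshold` toward 1.0 discards more of the spectrum, which is the knob the changed line describes as staying coherent but getting "drier and sloppier as you go up".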