BeaverAI/Tunguska-39B-v1b-GGUF

TheDrummer committed 11 days ago

Commit e1e3a7d • 1 Parent(s): 6d8441e

Update README.md

Files changed (1)

README.md +1 -1

@@ -8,7 +8,7 @@ My cute attempt at being a siyantis :3 uwu ~
 ## Conclusions (WIP)
 - Upscaling can 'provide room' for further training.
 - Training upscaled models will result in retaining more of the original model's performance & behavior.
-- A 600MB dataset did not seem to completely fill the empty/duplicated layers. (Rate of change remained the same for epoch 1 & 2)
+- A 600MB dataset was nowhere near in stabilizing the empty/duplicated layers. (Pertubed rate of change remained the same for epoch 1 & 2)
 - (Not related to upscaling) The first two layers are sus - their weights are wildly different from the original. I wonder if we could recover smarts by merging that back in with base, or if those layers contain the most influence and must be preserved.
 ## What is the 39B Upscale?
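
The conclusions in this diff talk about how far each layer's weights moved from the original, and about a rate of change across epochs. The README does not say how that was measured, so the following is only a minimal sketch of one plausible way to do it: a relative L2 distance per layer between a base checkpoint and a trained one. The function name `layer_drift`, the `model.layers.<index>.` parameter naming, and the toy weights are all assumptions made for illustration, not the author's method or data.

```python
import math
import re
from collections import defaultdict

# Assumed parameter naming: "...layers.<index>.<rest>" (hypothetical here).
LAYER_RE = re.compile(r"layers\.(\d+)\.")


def layer_drift(base, tuned):
    """Relative L2 change per layer: ||tuned - base|| / ||base||.

    `base` and `tuned` map parameter names to flat lists of floats.
    """
    diff_sq = defaultdict(float)
    base_sq = defaultdict(float)
    for name, base_vals in base.items():
        match = LAYER_RE.search(name)
        if match is None or name not in tuned:
            continue  # skip tensors outside numbered layers, e.g. embeddings
        layer = int(match.group(1))
        for b, t in zip(base_vals, tuned[name]):
            diff_sq[layer] += (t - b) ** 2
            base_sq[layer] += b ** 2
    return {
        layer: math.sqrt(diff_sq[layer]) / math.sqrt(base_sq[layer])
        for layer in sorted(base_sq)
        if base_sq[layer] > 0.0
    }


# Toy weights, made up for illustration only.
base = {
    "model.layers.0.mlp.weight": [1.0, 0.0],
    "model.layers.1.mlp.weight": [3.0, 4.0],
}
tuned = {
    "model.layers.0.mlp.weight": [0.0, 1.0],
    "model.layers.1.mlp.weight": [3.0, 4.0],
}
drift = layer_drift(base, tuned)
```

Running the same comparison against a checkpoint saved after each epoch would give a per-layer drift per epoch, and the difference between consecutive epochs is one way to read "rate of change". With real checkpoints the flat lists would come from the loaded tensors rather than hand-written values.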