Commit a7d5987 by TheDrummer
1 Parent(s): 1d36bb4
Update README.md
README.md
CHANGED
@@ -3,6 +3,11 @@
 
 # Upscaled Tuning Experiment Write Up Thingy
 
+## Conclusions (WIP)
+- Upscaling can 'provide room' for further training.
+- Training upscaled models will result in retaining more of the original model's performance & behavior.
+- (Not related to upscaling) The first two layers are sus - their weights are wildly different from the original. I wonder if we could recover smarts by merging that back in with base, or if those layers contain the most influence.
+
 ## What is the 39B Upscale?
 
 https://huggingface.co/TheSkullery/BA-Zephyria-39b
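
The added note that the first two layers differ wildly from the original could be checked with a per-layer comparison of the tuned weights against the base weights. The following is a minimal sketch, not the author's method: the function name `layer_drift`, the `layers.<index>.` parameter naming, and the toy weight dicts are all assumptions made for illustration, and the weights are plain lists of floats rather than real checkpoint tensors.

```python
import math
import re
from collections import defaultdict

# Assumed (hypothetical) parameter naming: "...layers.<index>.<rest>".
LAYER_RE = re.compile(r"layers\.(\d+)\.")


def layer_drift(base, tuned):
    """Return {layer_index: relative L2 distance} between two weight dicts.

    `base` and `tuned` map parameter names to flat lists of floats.
    The distance for a layer is ||tuned - base|| / ||base||, pooled over
    every parameter whose name contains that layer index.
    """
    diff_sq = defaultdict(float)
    base_sq = defaultdict(float)
    for name, base_vals in base.items():
        match = LAYER_RE.search(name)
        if match is None or name not in tuned:
            continue  # skip parameters outside numbered layers or missing in tuned
        tuned_vals = tuned[name]
        if len(tuned_vals) != len(base_vals):
            continue  # skip parameters whose shapes do not line up
        layer = int(match.group(1))
        for b, t in zip(base_vals, tuned_vals):
            diff_sq[layer] += (t - b) ** 2
            base_sq[layer] += b ** 2
    # Layers whose base weights are all zero are omitted (ratio undefined).
    return {
        layer: math.sqrt(diff_sq[layer] / base_sq[layer])
        for layer in sorted(base_sq)
        if base_sq[layer] > 0
    }


if __name__ == "__main__":
    # Toy data: layer 0 changed, layer 1 untouched.
    base = {
        "model.layers.0.mlp.weight": [1.0, 0.0],
        "model.layers.1.mlp.weight": [1.0, 0.0],
    }
    tuned = {
        "model.layers.0.mlp.weight": [0.0, 1.0],
        "model.layers.1.mlp.weight": [1.0, 0.0],
    }
    for layer, drift in layer_drift(base, tuned).items():
        print(f"layer {layer}: relative drift {drift:.3f}")
```

A layer whose drift is far above the rest would be a candidate for the "merge back in with base" idea in the note; whether that recovers anything is left open there and is not settled by this sketch.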