OpenChat-3.5-0106 models upscaled with the Block Expansion (BE) method. Unlike the more common DUP Scaling, BE does not require fine-tuning afterward to recover performance lost during upscaling.
- Pretergeek/OpenChat-3.5-0106_8.11B_36Layers-Interleaved (Text Generation, 8B)
- Pretergeek/OpenChat-3.5-0106_8.99B_40Layers-Interleaved (Text Generation, 9B)
- Pretergeek/OpenChat-3.5-0106_10.7B_48Layers-Interleaved (Text Generation, 11B)
- Pretergeek/OpenChat-3.5-0106_8.11B_36Layers-Appended (Text Generation, 8B)
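The "Interleaved" and "Appended" suffixes and the layer counts in the model names suggest where the extra blocks sit in the stack. The sketch below is only an illustration of that idea, not the procedure used to build these models. It assumes a 32-layer base model, that each new block is a copy of an existing one, and that interleaving places one copy after each equal-sized group of original layers. The names `expansion_plan` and `toy_block` are hypothetical. The toy block also shows why an added block can leave the output unchanged: if its output projection is zeroed (as described for Block Expansion in the literature), the residual connection passes the input straight through.

```python
from typing import List, Tuple


def expansion_plan(n_layers: int, n_new: int, mode: str) -> List[Tuple[int, bool]]:
    """Return (source_layer_index, is_new_copy) for each layer of the expanded stack.

    Assumption: new blocks are copies of existing layers; the exact placement
    used for the listed models is not confirmed by the source.
    """
    if mode == "appended":
        # Keep the original stack, then add copies of the last layer at the end.
        base = [(i, False) for i in range(n_layers)]
        return base + [(n_layers - 1, True)] * n_new
    if mode == "interleaved":
        if n_new <= 0 or n_layers % n_new != 0:
            raise ValueError("n_new must be positive and divide n_layers evenly")
        group = n_layers // n_new
        plan: List[Tuple[int, bool]] = []
        for i in range(n_layers):
            plan.append((i, False))
            # After each full group of original layers, insert one new copy.
            if (i + 1) % group == 0:
                plan.append((i, True))
        return plan
    raise ValueError(f"unknown mode: {mode!r}")


def toy_block(x: List[float], out_scale: float) -> List[float]:
    """Toy residual block: x + out_scale * g(x).

    out_scale stands in for the block's output projection; setting it to 0.0
    makes the block an identity mapping.
    """
    return [v + out_scale * (v * v) for v in x]


if __name__ == "__main__":
    for n_new in (4, 8, 16):
        plan = expansion_plan(32, n_new, "interleaved")
        print(n_new, "new blocks ->", len(plan), "layers")
    print(toy_block([1.0, -2.0], out_scale=0.0))
```

Under these assumptions, adding 4, 8, or 16 blocks to a 32-layer stack gives 36, 40, and 48 layers, which matches the layer counts in the model names above.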