---
license: apache-2.0
datasets:
- Open-Orca/SlimOrca
language:
- en
---
# TeeZee/NEBULA-XB-v1.03
An experiment: can DUS (Depth Up-Scaling) be taken one or more steps further?
## Technical notes
- the pretrained v03 model was finetuned on 50k entries from the SlimOrca dataset
- 18 layers were removed from each of the two copies of the finetuned GALAXY-XB-v03 (see the sketch after this list)
- the resulting model has 108 layers: ((48 - 12) * 2 - 18) * 2 = 108
- this is the second step in scaling up the DUS procedure
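
Below is a minimal sketch of what this layer-removal-and-stacking step could look like, assuming a Llama-style architecture that exposes its decoder layers as `model.model.layers`, and assuming (as in the original SOLAR DUS recipe) that the top 18 layers are dropped from one copy and the bottom 18 from the other. The source path, output directory, and exact layer split are illustrative placeholders, not confirmed by this card.

```python
# Illustrative DUS step: drop 18 layers from each of two copies of the
# finetuned 72-layer GALAXY-XB and stack them into a 108-layer model.
# Assumptions: Llama-style layout (model.model.layers), top-18/bottom-18 split.
import copy

import torch
from transformers import AutoModelForCausalLM

SOURCE = "TeeZee/GALAXY-XB-v03"   # placeholder path to the finetuned donor model
DROP = 18

donor = AutoModelForCausalLM.from_pretrained(SOURCE, torch_dtype=torch.bfloat16)
layers = donor.model.layers        # 72 decoder layers: (48 - 12) * 2

# Copy A keeps layers 0..53 (top 18 removed),
# copy B keeps layers 18..71 (bottom 18 removed); stacked: 54 + 54 = 108.
copy_a = [copy.deepcopy(layer) for layer in layers[: len(layers) - DROP]]
copy_b = [copy.deepcopy(layer) for layer in layers[DROP:]]

donor.model.layers = torch.nn.ModuleList(copy_a + copy_b)
donor.config.num_hidden_layers = len(donor.model.layers)

print(donor.config.num_hidden_layers)       # 108 = ((48 - 12) * 2 - 18) * 2
donor.save_pretrained("NEBULA-XB-sketch")   # hypothetical output directory
```

In practice a merge tool such as mergekit's passthrough method takes care of details like layer re-indexing; the sketch above only illustrates the layer arithmetic.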
## To evaluate
- model performance after the merge; it should be slightly lower than GALAXY finetuned on 50k SlimOrca entries