ablation-models (Collection): 1.8B-parameter models trained on 350BT (350 billion tokens) to compare different pretraining datasets. 8 items.