The Ultra-Scale Playbook 🌌: The ultimate guide to training LLMs on large GPU clusters
jasonkrone/pythia-1-dot-4b-deduped-111b-toks-mc-finetune-hpo-lr-with-mmlu Text Generation • Updated Jan 11
jasonkrone/pythia-1-dot-4b-deduped-69b-toks-mc-finetune-hpo-lr-with-mmlu Text Generation • Updated Jan 11
jasonkrone/pythia-1-dot-4b-deduped-27b-toks-try2-mc-finetune-hpo-lr-with-mmlu Text Generation • Updated Jan 11
jasonkrone/mmlu_with_mmlu_pro_train_and_concat_dev_val_for_dev_hpo Dataset • Updated Nov 15, 2024
📀 Dataset comparison models (Collection): 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024