NGME-LLama 264M

  • Trained for ~4 days on 4 A6000 GPUs
  • Trained on ~4.9 billion tokens (4 × 16 × 768 × 100,000)
  • Trained on the C4 corpus
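
The token count above can be sanity-checked with a quick calculation. The factor names below are an assumption about what the four numbers represent (GPUs × per-GPU batch size × sequence length × training steps); the product itself comes straight from the card.

```python
# Back-of-the-envelope token count. Assumed meaning of the factors:
# GPUs x per-GPU batch size x sequence length x training steps.
gpus = 4
batch_size = 16
seq_len = 768
steps = 100_000

tokens = gpus * batch_size * seq_len * steps
print(f"{tokens:,} tokens (~{tokens / 1e9:.1f}B)")
# -> 4,915,200,000 tokens (~4.9B)
```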