
LLamaStory-70M is a LLaMA model pre-trained on a story-generation dataset.

About Training:

  • Platform: EasyDel
  • Hardware: TPU v4
  • Batch size: 2048
  • Max position embeddings: 512
  • Epochs: 12 (so far)
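As a rough illustration of what the batch size and context length above imply, the sketch below computes an upper bound on the tokens seen per optimizer step. It assumes the batch size counts sequences (not tokens), that every sequence fills the full 512-position context, and that no gradient accumulation is involved; none of this is stated in the card. The function name is hypothetical.

```python
# Values taken from the training list above.
BATCH_SIZE = 2048
MAX_POSITION_EMBEDDINGS = 512


def max_tokens_per_step(batch_size: int, max_len: int) -> int:
    """Upper bound on tokens per step, assuming every sequence fills the context."""
    return batch_size * max_len


tokens = max_tokens_per_step(BATCH_SIZE, MAX_POSITION_EMBEDDINGS)
print(f"at most {tokens:,} tokens per step")
```

Shorter or padded sequences would lower the real figure, so treat this as a ceiling rather than a measurement.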

This model will be used to debug 4-bit and 8-bit training and inference in JAX and Rust with EasyDel.
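To make the 4-bit and 8-bit idea concrete, here is a minimal sketch of symmetric absmax quantization in plain Python. This is a generic textbook scheme chosen for illustration, not EasyDel's implementation, which the card does not describe; the function names and the toy weights are hypothetical.

```python
from typing import List, Tuple


def quantize_absmax(values: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed `bits`-bit integers using one shared absmax scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    absmax = max(abs(v) for v in values)
    scale = absmax / qmax if absmax > 0 else 1.0  # avoid dividing by zero
    # Round to the nearest integer level and clamp to the symmetric range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integer levels."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical toy weights
for bits in (8, 4):
    q, scale = quantize_absmax(weights, bits)
    restored = dequantize(q, scale)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"{bits}-bit levels: {q}, max reconstruction error: {max_err:.4f}")
```

Comparing the reconstruction error at 8 and 4 bits on a small model is one simple way to check that a quantized code path behaves as expected.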

Model details:

  • Weights format: Safetensors
  • Model size: 70.5M params
  • Tensor type: FP16
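For a sense of scale when comparing FP16 with the 4-bit and 8-bit modes mentioned in this card, the following sketch estimates weight storage from the parameter count. It assumes exactly 70.5 million parameters (the card gives a rounded figure) and ignores quantization scales and file metadata; the function name is hypothetical.

```python
PARAMS = 70_500_000  # "70.5M params" from the card; the exact count may differ


def weight_megabytes(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in MB (10**6 bytes), ignoring metadata."""
    return num_params * bits_per_param / 8 / 1e6


for name, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{name}: ~{weight_megabytes(PARAMS, bits):.1f} MB")
```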

Dataset used to train erfanzar/LLamaStory-70M