
Pretrained toy models, made with Andrej Karpathy's nanoGPT.

nano_35m

  • Trained in late 2023 on part of the Tagalog portion of Belebele.
  • batch_size = 64
  • block_size = 256
  • n_layer = 8
  • n_head = 8
  • n_embd = 768
  • Everything else is left as is.
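
nanoGPT reads training settings from a plain Python config file, so the values listed for nano_35m could be written roughly as follows. This is a minimal sketch, not the author's actual file: the file name, `out_dir`, and `dataset` are assumptions made for illustration, and only the five listed hyperparameters come from this card.

```python
# Hypothetical file: config/train_nano_35m.py (name and location are assumptions).
# The five values below are the ones listed for nano_35m on this card.
batch_size = 64
block_size = 256
n_layer = 8
n_head = 8
n_embd = 768

# Assumed for illustration only; the card does not state these.
out_dir = "out-nano_35m"
dataset = "tagalog_belebele"
```

With a file like this in place, training would be started from the nanoGPT repository with `python train.py config/train_nano_35m.py`; settings not overridden in the file keep the values set in `train.py`.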

nano_76m

  • Trained in January 2024 on part of the Tagalog portion of Belebele.
  • batch_size = 64
  • block_size = 256
  • n_layer = 11
  • n_head = 16
  • n_embd = 768
  • Everything else is left as is.
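
In multi-head attention the embedding width is split evenly across the heads, so `n_embd` must be divisible by `n_head`. The short sketch below checks that constraint and reports the per-head width for the two Belebele models. The helper name `head_size` is hypothetical and not part of nanoGPT.

```python
def head_size(n_embd: int, n_head: int) -> int:
    """Return the per-head width, or raise if n_embd does not split evenly."""
    if n_embd % n_head != 0:
        raise ValueError(f"n_embd={n_embd} is not divisible by n_head={n_head}")
    return n_embd // n_head


# (n_embd, n_head) pairs as listed on this card.
models = {
    "nano_35m": (768, 8),
    "nano_76m": (768, 16),
}

for name, (n_embd, n_head) in models.items():
    print(f"{name}: head size {head_size(n_embd, n_head)}")
```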

nano-ito_35m

  • Trained in March 2024 on part of the PALITO Tagalog dataset.
  • batch_size = 64
  • block_size = 256
  • n_layer = 11
  • n_head = 16
  • n_embd = 512
  • Everything else is left as is.
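
For a rough sense of scale, the weight matrices in a GPT-style transformer block total about 12 × n_embd² parameters (4 × n_embd² for attention, 8 × n_embd² for the MLP). The sketch below applies that approximation to the three configurations listed above. It is an estimate under stated assumptions: it ignores token and position embeddings, biases, and layer norms, so it will not match an exact parameter count, and the function name is hypothetical.

```python
def approx_block_params(n_layer: int, n_embd: int) -> int:
    """Approximate transformer-block weights: 12 * n_layer * n_embd^2.

    Ignores embeddings, biases, and layer-norm parameters.
    """
    return 12 * n_layer * n_embd ** 2


# (n_layer, n_embd) pairs as listed on this card.
configs = {
    "nano_35m": (8, 768),
    "nano_76m": (11, 768),
    "nano-ito_35m": (11, 512),
}

for name, (n_layer, n_embd) in configs.items():
    millions = approx_block_params(n_layer, n_embd) / 1e6
    print(f"{name}: ~{millions:.1f}M block parameters")
```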

Dataset used to train 922-Narra/nano_tagalog_models