tFINE-base-300m / README.md
license: apache-2.0
datasets:
  - HuggingFaceTB/smollm-corpus
language:
  - en
library_name: transformers
pipeline_tag: text2text-generation
tags:
  - fineweb
  - t5

- 1024 ctx (context length)
- SiLU activations
- `fineweb-edu-dedup` split of `HuggingFaceTB/smollm-corpus`
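
For readers unfamiliar with the activation named in the list above, SiLU (also called swish) is the elementwise function `x * sigmoid(x)`. The snippet below is a minimal pure-Python sketch of that definition only. It is not taken from this model's code, and the card does not say whether the feed-forward layers use a gated or ungated variant, so nothing here should be read as the model's actual layer layout.

```python
import math


def silu(x: float) -> float:
    """SiLU / swish activation: x * sigmoid(x), computed in a numerically stable way."""
    if x >= 0:
        # For non-negative x, exp(-x) is at most 1, so this form cannot overflow.
        return x / (1.0 + math.exp(-x))
    # For negative x, rewrite sigmoid(x) as exp(x) / (1 + exp(x)) to avoid overflow.
    ex = math.exp(x)
    return x * ex / (1.0 + ex)
```

In practice a framework implementation (for example `torch.nn.SiLU` in PyTorch) would be used instead of a scalar function like this one.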

plots

- training loss: [figure: loss]
- grad norm: [figure: grad]
- weights norm: [figure: weights]
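
Since the metadata lists `library_name: transformers` and `pipeline_tag: text2text-generation`, loading the checkpoint might look like the sketch below. Several things here are assumptions rather than statements from this card: the Hub repo id `pszemraj/tFINE-base-300m` is inferred from the page path and author name, the helper names `load` and `generate` are hypothetical, and the card does not describe what kind of prompt the model expects or how useful its raw output is. The code assumes `transformers` and PyTorch are installed.

```python
# Assumed repo id (inferred from the page, not stated in the card).
MODEL_ID = "pszemraj/tFINE-base-300m"
# Context length listed in the card ("1024 ctx").
MAX_CTX = 1024


def load(model_id: str = MODEL_ID):
    """Download the tokenizer and seq2seq model from the Hugging Face Hub."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def generate(text: str, tokenizer, model, max_new_tokens: int = 32) -> str:
    """Encode `text` (truncated to MAX_CTX tokens) and decode one generated sequence."""
    inputs = tokenizer(
        text, return_tensors="pt", truncation=True, max_length=MAX_CTX
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call would then be `tokenizer, model = load()` followed by `generate("some input text", tokenizer, model)`. The first call downloads the weights, so it needs network access.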