---
license: mit
datasets:
  - HuggingFaceFW/fineweb
language:
  - en
---

llm.c checkpoint: GPT-2 774M

This is a Hugging Face (safetensors) conversion of an llm.c checkpoint: a 774M-parameter GPT-2 model trained on 150B tokens from FineWeb.
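Since the checkpoint is in Hugging Face format, it should load with the standard `transformers` auto classes. The snippet below is a minimal sketch, not a tested recipe. It assumes the repository id is `mdouglas/llmc-gpt2-774M-150B` (inferred from the page header, not stated in this card) and that the repository ships tokenizer files; if it does not, the stock `gpt2` tokenizer is a likely substitute, since llm.c trains with the GPT-2 tokenizer. The helper name `generate` is made up for this example.

```python
# Assumed repository id (inferred, not confirmed by this model card).
REPO_ID = "mdouglas/llmc-gpt2-774M-150B"


def generate(prompt: str, max_new_tokens: int = 32, repo_id: str = REPO_ID) -> str:
    """Greedy-decode a continuation of `prompt` with the converted checkpoint."""
    # Imported lazily so that defining this helper does not require
    # transformers/torch to be installed or any weights to be downloaded.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: tokenizer files are present in the repository.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding, for reproducible output
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `print(generate("Hello, my name is"))` downloads the weights on first use. This is a base language model, so it continues text rather than following instructions.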

Training was conducted on a single 8xA100 80GB SXM node for ~6 days.
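As a rough sanity check on these figures, the arithmetic below estimates the implied training throughput. It is a back-of-envelope sketch only: it takes "~6 days" as exactly 6 days, uses the common approximation of 6 FLOPs per parameter per token for training compute, and compares against the A100's 312 TFLOP/s dense BF16 peak. None of these derived numbers are reported by the author.

```python
# Figures stated in this model card.
N_PARAMS = 774e6       # model parameters
N_TOKENS = 150e9       # training tokens
N_GPUS = 8             # one 8xA100 node

# Assumptions made for this estimate (not from the model card).
TRAIN_DAYS = 6.0               # "~6 days" taken as exactly 6
FLOPS_PER_PARAM_TOKEN = 6      # standard 6*N*D training-compute approximation
A100_PEAK_FLOPS = 312e12       # A100 dense BF16 peak, FLOP/s

train_seconds = TRAIN_DAYS * 24 * 3600

# Tokens processed per second, for the whole node and per GPU.
tokens_per_sec = N_TOKENS / train_seconds
tokens_per_sec_per_gpu = tokens_per_sec / N_GPUS

# Approximate total training compute and sustained rate per GPU.
total_flops = FLOPS_PER_PARAM_TOKEN * N_PARAMS * N_TOKENS
flops_per_sec_per_gpu = total_flops / train_seconds / N_GPUS

# Implied model FLOPs utilization relative to the assumed peak.
mfu = flops_per_sec_per_gpu / A100_PEAK_FLOPS

print(f"tokens/s (node):    {tokens_per_sec:,.0f}")
print(f"tokens/s (per GPU): {tokens_per_sec_per_gpu:,.0f}")
print(f"total FLOPs:        {total_flops:.3e}")
print(f"implied MFU:        {mfu:.1%}")
```

Under these assumptions the node processes roughly 290k tokens per second and the implied utilization is a little over half of peak. The result is sensitive to the exact training duration, so treat it as an order-of-magnitude figure.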

See discussion on GitHub for more information.