sTOpid-SLM-114M 🥴
🙏 Acknowledgements & Attribution
This model, sTOpid-SLM-114M, was built using the following open-source resources:
- Tokenizer: Uses the GPT-2 Tokenizer (BPE) provided by OpenAI.
- Dataset: Trained on the FineWeb-Edu dataset by Hugging Face.
- Underlying Data: Common Crawl (subject to its Terms of Use).
We gratefully acknowledge the OpenAI team for the GPT-2 architecture and the Hugging Face FineWeb team for the high-quality training data.
📊 Model Information
- Training Data: ~590M tokens from FineWeb-Edu
- Hardware: NVIDIA RTX PRO 6000 Blackwell Server Edition
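To put the training-data figure in perspective, the sketch below estimates how many training tokens the model saw per parameter. It assumes that the "114M" in the model name is the parameter count, which this card does not state explicitly, and it compares the result against the commonly cited rule of thumb of roughly 20 tokens per parameter for compute-optimal training. Treat both numbers as rough guides, not as a measurement of this model.

```python
# Rough tokens-per-parameter estimate from the figures on this card.
# ASSUMPTION: "114M" in the model name is the parameter count.
PARAMS = 114e6          # assumed from the name sTOpid-SLM-114M
TRAIN_TOKENS = 590e6    # "~590M tokens" from the Model Information list
RULE_OF_THUMB = 20      # ~20 tokens per parameter, a commonly cited heuristic


def tokens_per_param(tokens: float, params: float) -> float:
    """Return the number of training tokens seen per model parameter."""
    return tokens / params


ratio = tokens_per_param(TRAIN_TOKENS, PARAMS)
fraction_of_heuristic = ratio / RULE_OF_THUMB

print(f"tokens per parameter: {ratio:.1f}")
print(f"share of the ~20x heuristic: {fraction_of_heuristic:.0%}")
```

Under these assumptions the model has seen about 5 tokens per parameter, roughly a quarter of the heuristic, which is consistent with the card describing it as not fully trained.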
⚠️ Disclaimer & Behavior
- Not Fully Trained: This model is at an early stage of training. It hallucinates frequently; the output can be amusing, but it is often logically incorrect.
- Do Not Trust: Do not trust the model for facts, math, or serious advice. It is confidently wrong about almost everything.
- Purpose: Created for research, laughter, and pushing the limits of Blackwell hardware.
License: MIT