kirch's Collections
Scotch & SOTA 🥃 Pt. 4: Pre-Training Datasets 📜
Updated Sep 25, 2023
We gotta start somewhere. These JSONL files aren't gonna train themselves.
allenai/dolma • Updated Apr 17 • 2.92k • 747
allenai/peS2o • Updated Nov 28, 2023 • 209 • 130
tiiuae/falcon-refinedweb • Viewer • Updated Jun 20, 2023 • 968M • 1.59k • 759
CarperAI/pilev2-dev • Preview • Updated Mar 13, 2023 • 19
ArtifactAI/arxiv_research_code • Viewer • Updated Jul 26, 2023 • 4.72M • 1 • 4
ArtifactAI/arxiv_cplusplus_research_code • Viewer • Updated Jul 27, 2023 • 1.63M • 4 • 4
bigcode/the-stack • Viewer • Updated Apr 13, 2023 • 546M • 3.03k • 698
bigcode/starcoderdata • Viewer • Updated May 16, 2023 • 207M • 2.36k • 341
cerebras/SlimPajama-627B • Viewer • Updated Jul 7, 2023 • 2.16M • 6.52k • 387
euirim/goodwiki • Viewer • Updated Sep 11, 2023 • 44.8k • 203 • 47
nampdn-ai/tiny-textbooks • Viewer • Updated 37 minutes ago • 73 • 1.21k • 132
nampdn-ai/tiny-codes • Viewer • Updated Sep 30, 2023 • 1.63M • 352 • 203
roneneldan/TinyStories • Viewer • Updated Dec 4, 2023 • 2.14M • 35.3k • 431
nampdn-ai/tiny-bridgedict • Viewer • Updated Aug 4, 2023 • 17.6k • 2 • 13
nampdn-ai/tiny-webtext • Viewer • Updated Aug 27, 2023 • 2.32M • 41 • 28