🚧"raw" pretrained smol_llama checkpoints - WIP 🚧
BEEspoke Data
community
AI & ML interests
'an LLM is only as good as the dataset it was trained on' - Sun Tzu
Organization Card
🐝📊💁
Collections (6)
smol_llama 220M fine-tunes we did
BEE-spoke-data/smol_llama-220M-openhermes
Text Generation • Updated • 3.75k • 2
BEE-spoke-data/smol_llama-220M-open_instruct
Text Generation • Updated • 2.09k • 1
BEE-spoke-data/beecoder-220M-python
Text Generation • Updated • 189 • 2
BEE-spoke-data/zephyr-220m-sft-full
Text Generation • Updated • 3.51k • 1
spaces (1)
models (39)
BEE-spoke-data/Jamba-900M-doc-writer
Text Generation • Updated • 8 • 1
BEE-spoke-data/mega-ar-350m-L3t-v0.08-ultraTBfw
Text Generation • Updated • 64 • 1
BEE-spoke-data/Meta-Llama-3-8Bee
Text Generation • Updated • 2.01k
BEE-spoke-data/claude-tokenizer
Updated
BEE-spoke-data/TinyLlama-3T-1.1bee
Text Generation • Updated • 2.07k • 2
BEE-spoke-data/bert-plus-L8-v1.0-allNLI_matryoshka
Sentence Similarity • Updated • 1
BEE-spoke-data/bert-plus-L8-v1.0-synthSTSv3-4k
Sentence Similarity • Updated • 4
BEE-spoke-data/mega-encoder-small-16k-v1
Fill-Mask • Updated • 8 • 4
BEE-spoke-data/mega-small-embed-synthSTS-16384-v1
Sentence Similarity • Updated • 3 • 4
BEE-spoke-data/bert-plus-L8-v1.0-syntheticSTS-4k
Sentence Similarity • Updated • 6 • 4
datasets (53)
BEE-spoke-data/UltraTextbooks-2.1-fw_mix
Viewer • Updated • 84 • 2
BEE-spoke-data/fineweb-literature-100k
Viewer • Updated
BEE-spoke-data/fineweb-cryptid-5k
Viewer • Updated
BEE-spoke-data/MoistWeb-25k
Viewer • Updated
BEE-spoke-data/FineMeme-100k
Viewer • Updated • 1
BEE-spoke-data/beeweb-5k
Viewer • Updated • 2
BEE-spoke-data/fineweb-synergy-20k
Viewer • Updated
BEE-spoke-data/SaunaWeb-50k
Viewer • Updated
BEE-spoke-data/rp_books-en
Viewer • Updated • 7 • 1
BEE-spoke-data/gutenberg-en-v1-clean
Viewer • Updated • 19 • 2