🚧"raw" pretrained smol_llama checkpoints - WIP 🚧
BEEspoke Data
community
AI & ML interests
'an LLM is only as good as the dataset it was trained on' - Sun Tzu
Organization Card
🐝📊💁
Collections
6
smol_llama 220M fine-tunes we did
- BEE-spoke-data/smol_llama-220M-openhermes
  Text Generation • Updated • 4.3k • 2
- BEE-spoke-data/smol_llama-220M-open_instruct
  Text Generation • Updated • 2.33k • 1
- BEE-spoke-data/beecoder-220M-python
  Text Generation • Updated • 29 • 2
- BEE-spoke-data/zephyr-220m-sft-full
  Text Generation • Updated • 3.45k • 1
spaces
1
models
39
- BEE-spoke-data/mega-ar-350m-L3t-v0.08-ultraTBfw
  Text Generation • Updated • 51 • 1
- BEE-spoke-data/Meta-Llama-3-8Bee
  Text Generation • Updated • 1.11k
- BEE-spoke-data/claude-tokenizer
  Updated
- BEE-spoke-data/TinyLlama-3T-1.1bee
  Text Generation • Updated • 2.1k • 2
- BEE-spoke-data/bert-plus-L8-v1.0-allNLI_matryoshka
  Sentence Similarity • Updated • 1
- BEE-spoke-data/bert-plus-L8-v1.0-synthSTSv3-4k
  Sentence Similarity • Updated • 5
- BEE-spoke-data/mega-encoder-small-16k-v1
  Fill-Mask • Updated • 11 • 4
- BEE-spoke-data/mega-small-embed-synthSTS-16384-v1
  Sentence Similarity • Updated • 7 • 4
- BEE-spoke-data/bert-plus-L8-v1.0-syntheticSTS-4k
  Sentence Similarity • Updated • 4 • 3
- BEE-spoke-data/smol_llama-220M-openhermes
  Text Generation • Updated • 4.3k • 2
datasets
48
- BEE-spoke-data/SaunaWeb-50k
  Viewer • Updated
- BEE-spoke-data/beeweb-5k
  Viewer • Updated
- BEE-spoke-data/UltraTextbooks-2.1-fw_mix
  Viewer • Updated • 2
- BEE-spoke-data/rp_books-en
  Viewer • Updated • 3
- BEE-spoke-data/gutenberg-en-v1-clean
  Viewer • Updated • 21 • 2
- BEE-spoke-data/napierone-epub-raw
  Viewer • Updated
- BEE-spoke-data/napierone-pdf-raw
  Viewer • Updated
- BEE-spoke-data/fineweb-1000_64k
  Viewer • Updated
- BEE-spoke-data/the-stack-smol-xl-readable
  Viewer • Updated • 1
- BEE-spoke-data/Nvidia-DeepLearningExamples
  Viewer • Updated • 1