🚧"raw" pretrained smol_llama checkpoints - WIP 🚧
BEEspoke Data
community
AI & ML interests
'an LLM is only as good as the dataset it was trained on' - Sun Tzu
Organization Card
🐝📊💁
Collections (5)
smol_llama 220M fine-tunes we did
- BEE-spoke-data/smol_llama-220M-openhermes • Text Generation • Updated • 6.11k • 2
- BEE-spoke-data/smol_llama-220M-open_instruct • Text Generation • Updated • 2.5k • 1
- BEE-spoke-data/beecoder-220M-python • Text Generation • Updated • 2.15k • 2
- BEE-spoke-data/zephyr-220m-sft-full • Text Generation • Updated • 3.29k • 1
Spaces (1)
Models (37)
- BEE-spoke-data/claude-tokenizer • Updated
- BEE-spoke-data/TinyLlama-3T-1.1bee • Text Generation • Updated • 2.17k • 2
- BEE-spoke-data/bert-plus-L8-v1.0-allNLI_matryoshka • Sentence Similarity • Updated • 4
- BEE-spoke-data/bert-plus-L8-v1.0-synthSTSv3-4k • Sentence Similarity • Updated • 4
- BEE-spoke-data/mega-encoder-small-16k-v1 • Fill-Mask • Updated • 9 • 4
- BEE-spoke-data/mega-small-embed-synthSTS-16384-v1 • Sentence Similarity • Updated • 10 • 3
- BEE-spoke-data/bert-plus-L8-v1.0-syntheticSTS-4k • Sentence Similarity • Updated • 8 • 3
- BEE-spoke-data/smol_llama-220M-openhermes • Text Generation • Updated • 6.11k • 2
- BEE-spoke-data/NanoLlama-GQA-L10-A32_KV8-v13-KI • Text Generation • Updated • 2.2k • 1
- BEE-spoke-data/smol_llama-220M-open_instruct • Text Generation • Updated • 2.5k • 1
Datasets (35)
- BEE-spoke-data/bees-internal • Viewer • Updated • 5
- BEE-spoke-data/fineweb-100k_en-med • Viewer • Updated • 1
- BEE-spoke-data/fineweb-1M_en-med • Viewer • Updated
- BEE-spoke-data/allNLI-sbert • Viewer • Updated • 1 • 1
- BEE-spoke-data/gutenberg-en-v1-clean • Viewer • Updated • 2
- BEE-spoke-data/coedit-reworded-deduped • Updated • 5
- BEE-spoke-data/yahoo_answers_topics-long-text • Viewer • Updated • 1
- BEE-spoke-data/sp500-edgar-10k-markdown • Viewer • Updated • 1
- BEE-spoke-data/consumer-finance-complaints • Viewer • Updated
- BEE-spoke-data/financial-news-articles-filtered • Viewer • Updated • 15