Switch-Transformers release Collection This release includes various MoE (Mixture-of-Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Nov 6 • 6
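As a minimal sketch of how one of these checkpoints can be used (assuming the google/switch-base-8 checkpoint from this collection and a transformers version that ships the SwitchTransformers classes), loading and generating looks like:

```python
# Minimal sketch: loading a Switch Transformers MoE checkpoint.
# Assumes the google/switch-base-8 checkpoint from this collection.
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/switch-base-8")
model = SwitchTransformersForConditionalGeneration.from_pretrained("google/switch-base-8")

# Switch Transformers are T5-style seq2seq models, so they use the same
# text-to-text interface (here, T5's span-corruption sentinel token).
inputs = tokenizer("A <extra_id_0> walks into a bar.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```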
zephyr story Collection Sources mentioned in the hf.co/thomwolf tweet: x.com/Thom_Wolf/status/1720503998518640703 • 8 items • Updated 29 days ago • 15
Distil-Whisper Models Collection The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated 4 days ago • 29
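A minimal sketch of running one of these checkpoints (assuming the distil-whisper/distil-large-v2 checkpoint from this collection and the transformers speech-recognition pipeline) might look like:

```python
# Minimal sketch: transcribing audio with a Distil-Whisper checkpoint
# via the automatic-speech-recognition pipeline.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-large-v2",
    chunk_length_s=15,  # chunked decoding for long-form audio
)

# "audio.wav" is a hypothetical local audio file.
result = asr("audio.wav")
print(result["text"])
```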
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created, sorted by creation date descending so the most recently created repos appear at the top. • 117 items • Updated 34 minutes ago • 234
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Oct 27 • 98
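As a minimal sketch of chatting with one of these models (assuming the HuggingFaceH4/zephyr-7b-beta checkpoint from this collection and its bundled chat template):

```python
# Minimal sketch: prompting a Zephyr 7B checkpoint through the
# text-generation pipeline, using the tokenizer's chat template.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "Explain Mixture-of-Experts in one sentence."},
]
# The tokenizer ships a chat template, so apply_chat_template formats the
# conversation the way the model was fine-tuned to expect.
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
out = pipe(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(out[0]["generated_text"])
```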
LLM Leaderboard best models ❤️🔥 Collection A daily updated list of the models with the best evaluations on the LLM leaderboard. • 25 items • Updated about 11 hours ago • 84
SEAHORSE release Collection The SEAHORSE metrics (as described in https://arxiv.org/abs/2305.13194). • 12 items • Updated Nov 2 • 11
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21 • 31
Localizing Object-level Shape Variations with Text-to-Image Diffusion Models Paper • 2303.11306 • Published Mar 20 • 5