DeepHermes Collection Preview models of hybrid reasoner Hermes series • 6 items • Updated 28 days ago • 27
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 170
Article Welcome FalconMamba: The first strong attention-free 7B model Aug 12, 2024 • 110
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion Paper • 2407.01392 • Published Jul 1, 2024 • 46
Wavelets Are All You Need for Autoregressive Image Generation Paper • 2406.19997 • Published Jun 28, 2024 • 32
Article Preference Tuning LLMs with Direct Preference Optimization Methods Jan 18, 2024 • 55
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29, 2024 • 52