CoSTA*: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Paper • 2503.10613 • Published 3 days ago • 60
Automated Movie Generation via Multi-Agent CoT Planning Paper • 2503.07314 • Published 6 days ago • 37
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation Paper • 2503.10618 • Published 3 days ago • 16
Distilling Diversity and Control in Diffusion Models Paper • 2503.10637 • Published 3 days ago • 12
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 4 days ago • 49
Gemma 2 2B Release Collection The 2.6B parameter version of Gemma 2. • 6 items • Updated 4 days ago • 79
ShieldGemma Release Collection A series of safety classifiers, trained on top of Gemma 2, for developers to filter inputs and outputs of their applications. • 3 items • Updated 4 days ago • 12
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published 10 days ago • 61
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Paper • 2503.07365 • Published 6 days ago • 53
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching Paper • 2503.05179 • Published 9 days ago • 42
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters Space • Running • 2.26k
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published 13 days ago • 72