CoSTAast: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Paper • 2503.10613 • Published 3 days ago • 63
Automated Movie Generation via Multi-Agent CoT Planning Paper • 2503.07314 • Published 6 days ago • 37
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation Paper • 2503.10618 • Published 3 days ago • 16
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 4 days ago • 50
Gemma 2 2B Release Collection The 2.6B parameter version of Gemma 2. • 6 items • Updated 5 days ago • 79
ShieldGemma Release Collection A series of safety classifiers, trained on top of Gemma 2, for developers to filter inputs and outputs of their applications. • 3 items • Updated 5 days ago • 13
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published 10 days ago • 61
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Paper • 2503.07365 • Published 6 days ago • 53
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching Paper • 2503.05179 • Published 10 days ago • 42
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published 13 days ago • 72
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 19 days ago • 69
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 32 items • Updated 5 days ago • 145
view article Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 26 days ago • 65
DarwinLM: Evolutionary Structured Pruning of Large Language Models Paper • 2502.07780 • Published Feb 11 • 17