SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training Paper • 2501.06842 • Published 25 days ago • 15
3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering Paper • 2501.05131 • Published 28 days ago • 34
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them Paper • 2501.08292 • Published 23 days ago • 17
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding Paper • 2501.07888 • Published 23 days ago • 15
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens Paper • 2501.07730 • Published 24 days ago • 16
Diffusion Adversarial Post-Training for One-Step Video Generation Paper • 2501.08316 • Published 23 days ago • 32
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 23 days ago • 272
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization Paper • 2412.21037 • Published Dec 30, 2024 • 23
Efficiently Serving LLM Reasoning Programs with Certaindex Paper • 2412.20993 • Published Dec 30, 2024 • 35
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response Paper • 2412.14922 • Published Dec 19, 2024 • 85
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published Dec 23, 2024 • 43
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up Paper • 2412.16112 • Published Dec 20, 2024 • 22
LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps Paper • 2412.15035 • Published Dec 19, 2024 • 4
IDOL: Instant Photorealistic 3D Human Creation from a Single Image Paper • 2412.14963 • Published Dec 19, 2024 • 6
FastVLM: Efficient Vision Encoding for Vision Language Models Paper • 2412.13303 • Published Dec 17, 2024 • 13