Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs Paper • 2504.00072 • Published 14 days ago • 7
How far can we go with ImageNet for Text-to-Image generation? Paper • 2502.21318 • Published Feb 28 • 25
AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities Paper • 2412.14123 • Published Dec 18, 2024 • 11
MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views Paper • 2412.06767 • Published Dec 9, 2024 • 7
Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation Paper • 2412.06781 • Published Dec 9, 2024 • 21
Don't drop your samples! Coherence-aware training benefits Conditional diffusion Paper • 2405.20324 • Published May 30, 2024
OpenStreetView-5M: The Many Roads to Global Visual Geolocation Paper • 2404.18873 • Published Apr 29, 2024