Masking Teacher and Reinforcing Student for Distilling Vision-Language Models
Paper
• 2512.22238
• Published • 30
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation
Paper
• 2512.17012
• Published • 48
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields
Paper
• 2601.03252
• Published • 104
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits
Paper
• 2512.20578
• Published • 86
RyeCatcher/speculative-decoding-cross-domain-analysis
Updated
SceneDiff: A Benchmark and Method for Multiview Object Change Detection
Paper
• 2512.16908
• Published • 1
PaperBanana: Automating Academic Illustration for AI Scientists
Paper
• 2601.23265
• Published • 223
Qwen3-TTS Technical Report
Paper
• 2601.15621
• Published • 74
SAM 3D: 3Dfy Anything in Images
Paper
• 2511.16624
• Published • 114
Can Large Language Models Understand Context?
Paper
• 2402.00858
• Published • 24
More Agents Is All You Need
Paper
• 2402.05120
• Published • 57
OLMo: Accelerating the Science of Language Models
Paper
• 2402.00838
• Published • 85
Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters
Paper
• 2403.02677
• Published • 18