General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3 • 82
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model Paper • 2408.16767 • Published Aug 29 • 29
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 177