Gaze detection using Moondream
Unified Framework for Generalized Video Face Restoration
Dense Grounded Understanding of Images and Videos
FitDiT is a high-fidelity virtual try-on model.
GANs are so back!
https://huggingface.co/papers/2501.03006
Video Super-Resolution with Text-to-Video Model
Audio-Conditioned LipSync with Latent Diffusion Models