Gaze detection using Moondream
Unified Framework for Generalized Video Face Restoration
Dense Grounded Understanding of Images and Videos
FitDiT is a high-fidelity virtual try-on model.
GANs are so back!
https://huggingface.co/papers/2501.03006
Video Super-Resolution with Text-to-Video Model
Audio Conditioned LipSync with Latent Diffusion Models