Rui Zhao

ruizhaocv

https://ruizhaocv.github.io/

AI & ML interests

Multimodal and GenAI

ruizhaocv's activity

upvoted a paper 3 days ago

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

Paper • 2503.00865 • Published 7 days ago • 55

authored a paper 4 days ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published 4 days ago • 15

upvoted a paper 4 days ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published 4 days ago • 15

commented on a paper 4 days ago

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Paper • 2503.03651 • Published 4 days ago • 15

upvoted a paper 6 days ago

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Paper • 2503.01774 • Published 6 days ago • 37

upvoted a paper 14 days ago

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

Paper • 2502.14397 • Published 18 days ago • 38

liked a model 15 days ago

Comfy-Org/HunyuanVideo_repackaged

Updated about 13 hours ago • 141

upvoted a paper 17 days ago

Dynamic Concepts Personalization from Single Videos

Paper • 2502.14844 • Published 17 days ago • 16

upvoted 2 papers 18 days ago

AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence

Paper • 2502.13943 • Published 18 days ago • 7

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 18 days ago • 157

upvoted 2 papers 19 days ago

Phantom: Subject-consistent video generation via cross-modal alignment

Paper • 2502.11079 • Published 21 days ago • 52

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models

Paper • 2502.10458 • Published 26 days ago • 30

liked a model 19 days ago

Skywork/SkyReels-A1

Image-to-Video • Updated 5 days ago • 786 • 46

upvoted 2 papers 19 days ago

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

Paper • 2502.12148 • Published 20 days ago • 16

ReLearn: Unlearning via Learning for Large Language Models

Paper • 2502.11190 • Published 21 days ago • 29

upvoted 2 papers 20 days ago

Learning Getting-Up Policies for Real-World Humanoid Robots

Paper • 2502.12152 • Published 20 days ago • 37

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published 23 days ago • 51

upvoted 2 papers 24 days ago

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Paper • 2502.08590 • Published 25 days ago • 40

CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation

Paper • 2502.08639 • Published 25 days ago • 37

upvoted a paper 25 days ago

WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation

Paper • 2502.08047 • Published 26 days ago • 26