Zeyi Sun

Zery

https://github.com/SunzeY

AI & ML interests

Recent Activity

upvoted a collection about 13 hours ago

ViRFT Datasets

liked a Space 5 days ago

aleafy/RelightVid

authored a paper 6 days ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Organizations

None yet

Zery's activity

upvoted a collection about 13 hours ago

ViRFT Datasets

Collection

ViRFT Datasets • 8 items • Updated 14 days ago • 5

liked a Space 5 days ago

RelightVid

authored a paper 6 days ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published 6 days ago • 59

upvoted a paper 6 days ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published 6 days ago • 59

commented on a paper 6 days ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published 6 days ago • 59

upvoted a paper 6 days ago

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting

Paper • 2501.16330 • Published Jan 27 • 1

upvoted a paper 12 days ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published 12 days ago • 69

authored a paper 14 days ago

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting

Paper • 2501.16330 • Published Jan 27 • 1

liked a Space 14 days ago

376

OmniParser V2

OmniParser, turn your LLM into GUI agent

liked a model 14 days ago

microsoft/OmniParser-v2.0

Image-Text-to-Text • Updated 20 days ago • 8.78k • 1.12k

upvoted a paper 18 days ago

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Paper • 2502.13128 • Published 19 days ago • 37

liked a dataset 23 days ago

OS-Copilot/OS-Atlas-data

Updated Dec 4, 2024 • 20.7k • 16

liked a dataset 26 days ago

osunlp/Mind2Web

Viewer • Updated Jul 19, 2023 • 253 • 655 • 100

liked a model 27 days ago

ysmikey/Layerpano3D-FLUX-Panorama-LoRA

Text-to-Image • Updated 29 days ago • 3

upvoted a paper 28 days ago

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Paper • 2502.05173 • Published about 1 month ago • 64

upvoted a paper about 2 months ago

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Paper • 2501.12368 • Published Jan 21 • 42

liked a model about 2 months ago

physical-intelligence/fast

Robotics • Updated Jan 16 • 85

liked a Space about 2 months ago

2.96k

IC Light V2

Run code from environment variable

upvoted 2 papers 2 months ago

Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction

Paper • 2501.03218 • Published Jan 6 • 36

BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning

Paper • 2501.03226 • Published Jan 6 • 41