Zesen Cheng
ClownRat
AI & ML interests
Multi-modal foundation models; segmentation, detection, and tracking
Recent Activity
published a dataset 1 day ago
ClownRat/YoutubeVIS-2019
upvoted a paper 1 day ago
Valley2: Exploring Multimodal Models with Scalable Vision-Language Design
upvoted a paper 1 day ago
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders