Wentao Ma

tonymwt

https://iamtonymwt.github.io/

iamtonymwt

AI & ML interests

MLLM, GenAI, Robotics

Recent Activity

upvoted a paper 3 days ago

MoCha: Towards Movie-Grade Talking Character Synthesis

published a Space 7 days ago

tonymwt/LVU_Leaderboard

updated a Space 7 days ago

tonymwt/LVU_Leaderboard

Organizations

None yet

tonymwt's activity

upvoted a paper 3 days ago

MoCha: Towards Movie-Grade Talking Character Synthesis

Paper • 2503.23307 • Published 5 days ago • 64

published a Space 7 days ago

LVU VLM Leaderboard

🥇

A leaderboard for long video understanding

updated a Space 7 days ago

LVU VLM Leaderboard

🥇

A leaderboard for long video understanding

authored a paper 17 days ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published 20 days ago • 18

upvoted a paper 17 days ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published 20 days ago • 18

upvoted a paper 21 days ago

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Paper • 2503.10582 • Published 21 days ago • 20

upvoted a paper 4 months ago

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Paper • 2412.00927 • Published Dec 1, 2024 • 27

liked a model 11 months ago

microsoft/Phi-3-vision-128k-instruct

Text Generation • Updated Aug 20, 2024 • 46.9k • 959

upvoted an article 11 months ago

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Article • Published May 14, 2024 • 244

liked 2 models about 1 year ago

meta-llama/Llama-2-7b-chat-hf

Text Generation • Updated Apr 17, 2024 • 1.28M • 4.35k

TheBloke/Llama-2-7B-Chat-GPTQ

Text Generation • Updated Sep 27, 2023 • 18.4k • 263