fdsqefsgergd

T-representer

AI & ML interests

None yet

Organizations

None yet

T-representer's activity

upvoted 8 papers about 3 hours ago

ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding

Paper • 2501.05452 • Published 4 days ago • 4

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published 3 days ago • 21

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Paper • 2501.05510 • Published 4 days ago • 21

OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints

Paper • 2501.03841 • Published 6 days ago • 33

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published 3 days ago • 33

Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains

Paper • 2501.05707 • Published 3 days ago • 3

ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning

Paper • 2501.04698 • Published 5 days ago • 5

Multi-subject Open-set Personalization in Video Generation

Paper • 2501.06187 • Published 3 days ago • 4

liked a model about 17 hours ago

vikhyatk/moondream1

Text Generation • Updated Feb 7, 2024 • 112k • 486

liked 3 models 2 days ago

Qwen/Qwen2-Audio-7B

Audio-Text-to-Text • Updated Nov 20, 2024 • 10.2k • 89

Qwen/Qwen2-VL-72B-Instruct

Image-Text-to-Text • Updated 1 day ago • 128k • 255

Qwen/QVQ-72B-Preview

Image-Text-to-Text • Updated 1 day ago • 123k • 490

upvoted a paper 2 days ago

Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

Paper • 2308.12966 • Published Aug 24, 2023 • 8

liked 3 models 2 days ago

Qwen/Qwen2-VL-7B

Image-Text-to-Text • Updated 1 day ago • 11.1k • 32

Qwen/Qwen2-VL-2B-Instruct

Image-Text-to-Text • Updated 1 day ago • 1.8M • 360

Qwen/QwQ-32B-Preview

Text Generation • Updated 1 day ago • 139k • 1.54k

upvoted a paper 2 days ago

SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution

Paper • 2501.05040 • Published 4 days ago • 11

liked a dataset 3 days ago

amphion/Emilia-Dataset

Viewer • Updated Sep 6, 2024 • 52.9M • 37k • 190

liked 2 models 3 days ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated 7 days ago • 11.4k • 863

mistralai/Pixtral-Large-Instruct-2411

Image-Text-to-Text • Updated 18 days ago • 2 • 385