VLM Benchmarks - a marcusinthesky Collection

VLM Benchmarks

updated Oct 15

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

Paper • 2410.10139 • Published Oct 14 • 51

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Paper • 2410.10563 • Published Oct 14 • 38

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

Paper • 2410.10783 • Published Oct 14 • 26

TVBench: Redesigning Video-Language Evaluation

Paper • 2410.07752 • Published Oct 10 • 5