MMMU

non-profit

https://mmmu-benchmark.github.io/

MMMU-Benchmark

AI & ML interests

Multimodal Model Evaluation

MMMU's activity

yuanshengni

updated a dataset 1 day ago

MMMU/MMMU_Pro

Viewer • Updated 1 day ago • 5.19k • 5.44k • 22

wenhu

authored a paper 4 days ago

ABC: Achieving Better Control of Multimodal Embeddings using VLMs

Paper • 2503.00329 • Published 9 days ago • 18

zhangysk

authored 3 papers 11 days ago

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models

Paper • 2502.13059 • Published 19 days ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 17 days ago • 94

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

Paper • 2502.19361 • Published 11 days ago • 26

zhangysk

authored 2 papers 12 days ago

CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models

Paper • 2502.16614 • Published 14 days ago • 24

Audio-FLAN: A Preliminary Release

Paper • 2502.16584 • Published 15 days ago • 33

a43992899

authored a paper 12 days ago

Audio-FLAN: A Preliminary Release

Paper • 2502.16584 • Published 15 days ago • 33

yuanshengni

authored a paper 17 days ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 17 days ago • 94

aaabiao

authored a paper 17 days ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 17 days ago • 94

yuanshengni

in MMMU/MMMU_Pro 24 days ago

ValueError: BuilderConfig 'standard' not found. Available: ['standard (10 options)', 'standard (4 options)', 'vision']

#5 opened 3 months ago by shilinxu
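
The error in this discussion title indicates that 'standard' is no longer a config name on its own: the message lists 'standard (10 options)', 'standard (4 options)', and 'vision' as the available configs. A minimal sketch of one way to guard against this, assuming the Hugging Face `datasets` library is installed; the helper names `resolve_config` and `load_mmmu_pro` are hypothetical and not part of the dataset's own tooling:

```python
from typing import Sequence

# Config names copied verbatim from the error message in the discussion title.
AVAILABLE_CONFIGS = ["standard (10 options)", "standard (4 options)", "vision"]


def resolve_config(requested: str,
                   available: Sequence[str] = AVAILABLE_CONFIGS) -> str:
    """Return the one config name matching `requested` (hypothetical helper).

    An exact match wins; otherwise a unique prefix match is accepted.
    An ambiguous or unknown name raises ValueError listing the candidates.
    """
    if requested in available:
        return requested
    candidates = [name for name in available if name.startswith(requested)]
    if len(candidates) == 1:
        return candidates[0]
    options = candidates or list(available)
    raise ValueError(
        f"Config {requested!r} is ambiguous or unknown; choose one of: {options}"
    )


def load_mmmu_pro(config: str):
    """Load MMMU/MMMU_Pro with a validated config name (needs network access)."""
    # Imported lazily so that resolve_config works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("MMMU/MMMU_Pro", resolve_config(config))
```

With this sketch, `load_mmmu_pro("standard (10 options)")` passes a full config name through unchanged, while `load_mmmu_pro("standard")` fails early with a message naming both 'standard' variants instead of reaching the loader. The list of configs is hard-coded from the error message and may change if the dataset is updated again.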

aaabiao

authored a paper 27 days ago

Steel-LLM:From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM

Paper • 2502.06635 • Published 27 days ago • 4

zhangysk

authored 4 papers 28 days ago

yuexiang96

authored a paper about 1 month ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published Feb 5 • 55

gneubig

authored a paper about 1 month ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published Feb 5 • 55

DongfuJiang

authored a paper about 1 month ago

ACECODER: Acing Coder RL via Automated Test-Case Synthesis

Paper • 2502.01718 • Published Feb 3 • 29

RLSNLP

authored a paper about 1 month ago

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Paper • 2311.16502 • Published Nov 27, 2023 • 35

Team members 17
