CMU-LTI

university

LTIatCMU

cmu-lti's activity

Xuhui

authored a paper 6 days ago

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published 7 days ago • 43

gneubig

authored a paper 7 days ago

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published 7 days ago • 43

gneubig

authored a paper 17 days ago

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published 20 days ago • 46

gneubig

authored a paper 20 days ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published 21 days ago • 43

seungone

authored a paper 20 days ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published 21 days ago • 43

gneubig

authored a paper about 1 month ago

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Paper • 2411.14199 • Published Nov 21 • 29

aashiqmuhamed

authored a paper about 2 months ago

Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models

Paper • 2411.00743 • Published Nov 1 • 6

seungone

authored 2 papers about 2 months ago

MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models

Paper • 2410.17578 • Published Oct 23 • 1

Better Instruction-Following Through Minimum Bayes Risk

Paper • 2410.02902 • Published Oct 3

gneubig

authored a paper 2 months ago

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Paper • 2410.17250 • Published Oct 22 • 14

skhanuja

authored a paper 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 43

gneubig

authored a paper 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 43

Nyandwi

authored a paper 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 43

seungone

authored a paper 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 43

gneubig

authored a paper 2 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18 • 36

skhanuja

authored a paper 2 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18 • 36

zhiqiulin

authored a paper 2 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18 • 36

Nyandwi

authored a paper 2 months ago

NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples

Paper • 2410.14669 • Published Oct 18 • 36

gneubig

authored a paper 2 months ago

Harnessing Webpage UIs for Text-Rich Visual Understanding

Paper • 2410.13824 • Published Oct 17 • 29

zhiqings

authored a paper 2 months ago

An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

Paper • 2408.00724 • Published Aug 1 • 1