prometheus-eval

university

prometheus-eval's activity

scottsuk0306 new activity in prometheus-eval/prometheus-7b-v2.0 about 16 hours ago

Tokenizer chat template doesn't accept system prompt

#3 opened 5 months ago by

scottsuk0306 authored a paper 20 days ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published 21 days ago • 43

Seongyun authored a paper 20 days ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published 21 days ago • 43

seungone authored a paper 20 days ago

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published 21 days ago • 43

seungone updated 2 models 27 days ago

prometheus-eval/prometheus-8x7b-v2.0

Text2Text Generation • Updated 27 days ago • 1.19k • 47

prometheus-eval/prometheus-7b-v2.0

Text2Text Generation • Updated 27 days ago • 23k • 83

seungone updated a dataset about 1 month ago

prometheus-eval/MMQA

Viewer • Updated Nov 18 • 330 • 47 • 3

nlee-208 authored a paper about 2 months ago

Cross-lingual Transfer of Reward Models in Multilingual Alignment

Paper • 2410.18027 • Published Oct 23

seungone authored 2 papers about 2 months ago

MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models

Paper • 2410.17578 • Published Oct 23 • 1

Better Instruction-Following Through Minimum Bayes Risk

Paper • 2410.02902 • Published Oct 3

amphora updated 2 datasets 2 months ago

prometheus-eval/MMQA

Viewer • Updated Nov 18 • 330 • 47 • 3

prometheus-eval/MM-Eval

Viewer • Updated Oct 26 • 11.1k • 178 • 5

DKYoon updated a dataset 2 months ago

prometheus-eval/MM-Eval

Viewer • Updated Oct 26 • 11.1k • 178 • 5

seungone authored a paper 2 months ago

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published Oct 21 • 43

hyungjoochae authored a paper 2 months ago

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

Paper • 2410.13232 • Published Oct 17 • 40

seungone updated a dataset 2 months ago

prometheus-eval/BiGGen-Bench

Viewer • Updated Oct 16 • 765 • 259 • 12

hyungjoochae authored a paper 3 months ago

Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code

Paper • 2409.19715 • Published Sep 29 • 8

nlee-208 authored 2 papers 3 months ago

Margin-aware Preference Optimization for Aligning Diffusion Models without Reference

Paper • 2406.06424 • Published Jun 10 • 12

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Paper • 2406.05761 • Published Jun 9 • 2

seungone authored a paper 4 months ago

Consent in Crisis: The Rapid Decline of the AI Data Commons

Paper • 2407.14933 • Published Jul 20 • 12