Spaces: Running on CPU Upgrade
Question About Accuracy Calculation in HaluEval
#28 opened 17 days ago by SungJoo
use_remote_code=True
#27 opened 3 months ago by schuler

Accessing examples used for n-shot evals
#26 opened 5 months ago by akritivij
Certain models perhaps clogging up the leaderboard?, Check logs?
1
#25 opened 10 months ago by CombinHorizon
How are Faithfulness and Factuality calculated?
2
#22 opened 11 months ago by UjjwalP
How could #parameter of a model be 0?
2
#20 opened 12 months ago by zhiminy

Why is the score for RACE so low?
1
#18 opened 12 months ago by scinerd68
Adding German Faithfulness Detection Task
1
#16 opened about 1 year ago by mtc
Adding SummEdits to leaderboard?
1
#12 opened about 1 year ago by philippelaban
Adding tasks from the USB benchmark (for summarization)
1
#11 opened about 1 year ago by kundank
Adding the Snowball Hallucination detection datasets
#9 opened about 1 year ago by ofirpress
Longform QA
2
#8 opened about 1 year ago by shehzaadzd
Metrics for hallucination detection for summarization.
4
#6 opened about 1 year ago by rohitsaxena
Hello all!
#5 opened about 1 year ago by pminervini