Spaces: hallucinations-leaderboard/leaderboard

use_remote_code=True

#27 opened 11 days ago by

Accessing examples used for n-shot evals

#26 opened 2 months ago by

Certain models perhaps clogging up the leaderboard? Check logs?

#25 opened 7 months ago by

How are Faithfulness and Factuality calculated?

#22 opened 9 months ago by

How could #parameter of a model be 0?

#20 opened 9 months ago by

Why is the score for RACE so low?

#18 opened 9 months ago by

Adding German Faithfulness Detection Task

#16 opened 10 months ago by

Adding SummEdits to leaderboard?

#12 opened 11 months ago by

Adding tasks from the USB benchmark (for summarization)

#11 opened 11 months ago by

Adding the Snowball Hallucination detection datasets

#9 opened 11 months ago by

Longform QA

#8 opened 11 months ago by

Metrics for hallucination detection for summarization.

#6 opened 11 months ago by

Hello all!

#5 opened 11 months ago by