autoevaluate/model-evaluator

#58 opened 11 months ago by

Michiga

Assertion error when executing from datasets import load_dataset

#57 opened 12 months ago by

harshraj

Getting an error on extractive question answering evaluation of dataset openassistant-guanaco by timdettmers on 🤗 Model Evaluator. Please help.

#56 opened about 1 year ago by

harshraj

Need help downloading the dolma data set

#55 opened about 1 year ago by

DoctorBoxer

UI Problems (and eval_info bug)

#54 opened about 1 year ago by

Please add an ASR task to Autoevaluate and also allow authentication for datasets

#53 opened about 1 year ago by

RajSang

KeyError:'eval_info' on squad_v2

#52 opened about 1 year ago by

etweedy

ImageNet 401 Error

#51 opened about 1 year ago by

George-Ogden

No option for Model Evaluation on 'marsyas/gtzan' dataset

#50 opened about 1 year ago by

vineetsharma

Autotrain server-side errors

4

#49 opened about 1 year ago by

Fix Repository not creating repo anymore

#48 opened about 1 year ago by

Wauplin

Remove version restriction on huggingface-hub

4

#47 opened about 1 year ago by

Upload الوحدة الثالثة.pdf

#46 opened over 1 year ago by

abdelrahman54902

My dataset cannot be evaluated

5

#45 opened over 1 year ago by

Can you fix the build error, please?

#44 opened over 1 year ago by

ma2za

Upload CraftingAChangeMessageToCreateTransformationalReadiness(2002).pdf

#43 opened over 1 year ago by

DeltaVox

Is this actually working?

#42 opened over 1 year ago by

alvarobartt

Conversational evaluator

#41 opened over 1 year ago by

yicui

Huh

#40 opened over 1 year ago by

Foxxo

How to add dataset stacked-summaries/stacked-samsum-1024

#39 opened over 1 year ago by

pszemraj

Autoevaluate not triggering for squad_v2

#38 opened over 1 year ago by

sjrhuschlee

Update utils.py

#37 opened over 1 year ago by

Achitha

Update app.py

#36 opened over 1 year ago by

Achitha

ASR not a task option for google/FLEURS dataset

#35 opened over 1 year ago by

Ari

mozilla-foundation/common_voice_11_0 not showing as dataset

#34 opened almost 2 years ago by

jordimas

Gated datasets are not supported: HTTPError: 401 Client Error: Unauthorized for url: https://datasets-server.huggingface.co/is-valid

#33 opened almost 2 years ago by

jordimas

Location of text_zero_shot_classification code

#32 opened almost 2 years ago by

breakend

Delete notebooks/flush-prediction-repos.ipynb

#31 opened almost 2 years ago by

Maksi77777

Chain Of Thought Zero-Shot Prompting

#30 opened almost 2 years ago by

WillHeld

Evaluation is not consistent with local evaluation

#29 opened almost 2 years ago by

morenolq

Add Dataset for financial text classification

#28 opened almost 2 years ago by

nickmuchi

Add a new dataset, CondaQA, in Model Evaluator?

#27 opened almost 2 years ago by

anamarasovic

Run evaluation on private dataset

#26 opened almost 2 years ago by

phpthinh

Add object detection models

#25 opened almost 2 years ago by

timhigins

Availability to evaluate LLMs like in the HF blog post

3

#24 opened almost 2 years ago by

sjrhuschlee

Stereotyping Norwegian Salmon

#23 opened almost 2 years ago by

vlordier

Evaluate on Quora

#22 opened almost 2 years ago by

nickmuchi

Evaluating on SQuAD gives 404 Client Error

7

#21 opened about 2 years ago by

timbmg

Not getting pull request

3

#20 opened about 2 years ago by

Samuel-Fipps

Is it normal for it to take over a day for evaluating?

6

#19 opened about 2 years ago by

Samuel-Fipps

Evaluation of the same model on multiple datasets leads to too many metrics, and results get difficult to read

#18 opened about 2 years ago by

MoritzLaurer

Evaluating the same model on different splits of the same dataset creates ambiguous evaluation

#17 opened about 2 years ago by

MoritzLaurer

Two successive evaluations on the same model created conflicting README.md

#16 opened about 2 years ago by

MoritzLaurer

HTTP issues using app

8

#15 opened about 2 years ago by

pszemraj

Multiple pull requests for the same dataset and model

#14 opened about 2 years ago by

grapplerulrich

Custom models with trust_remote_code=True

#13 opened about 2 years ago by

ccdv

I can't choose wmt16 datasets

5

#12 opened about 2 years ago by

Lvxue

I can't choose a model to evaluate

5

#11 opened about 2 years ago by

BDas

Segments-sidewalk