autoevaluate/model-evaluator

#58 opened almost 2 years ago by

Michiga

Assertion error when executing 'from datasets import load_dataset'

#57 opened almost 2 years ago by

harshraj

Need help downloading the Dolma dataset

#55 opened about 2 years ago by

DoctorBoxer

UI Problems (and eval_info bug)

#54 opened about 2 years ago by

ejschwartz

Please add an ASR task to Autoevaluate and also allow authentication for datasets

👍 2

#53 opened about 2 years ago by

RajSang

KeyError:'eval_info' on squad_v2

#52 opened about 2 years ago by

etweedy

ImageNet 401 Error

#51 opened about 2 years ago by

George-Ogden

No option for Model Evaluation on 'marsyas/gtzan' dataset

#50 opened about 2 years ago by

vineetsharma

Autotrain server-side errors

4

#49 opened about 2 years ago by

ejschwartz

Upload الوحدة الثالثة.pdf

#46 opened over 2 years ago by

abdelrahman54902

Can you fix the build error, please?

#44 opened over 2 years ago by

ma2za

Upload CraftingAChangeMessageToCreateTransformationalReadiness(2002).pdf

#43 opened over 2 years ago by

DeltaVox

Is this actually working?

#42 opened over 2 years ago by

alvarobartt

Conversational evaluator

#41 opened over 2 years ago by

yicui

How to add the dataset stacked-summaries/stacked-samsum-1024

#39 opened over 2 years ago by

pszemraj

Autoevaluate not triggering for squad_v2

#38 opened over 2 years ago by

sjrhuschlee

ASR not a task option for google/FLEURS dataset

👍 1

#35 opened over 2 years ago by

Ari

mozilla-foundation/common_voice_11_0 not showing as dataset

👍 5

#34 opened almost 3 years ago by

jordimas

Gated datasets are not supported: HTTPError: 401 Client Error: Unauthorized for url: https://datasets-server.huggingface.co/is-valid

#33 opened almost 3 years ago by

jordimas

Location of text_zero_shot_classification code

#32 opened almost 3 years ago by

breakend

Chain Of Thought Zero-Shot Prompting

#30 opened almost 3 years ago by

WillHeld

Evaluation is not consistent with local evaluation

#29 opened almost 3 years ago by

morenolq

Add Dataset for financial text classification

#28 opened almost 3 years ago by

Add a new dataset, CondaQA, in Model Evaluator?

#27 opened almost 3 years ago by

anamarasovic

Run evaluation on private dataset

#26 opened almost 3 years ago by

phpthinh

Add object detection models

#25 opened almost 3 years ago by

timhigins

Ability to evaluate LLMs as in the HF blog post

3

#24 opened almost 3 years ago by

sjrhuschlee

Evaluate on Quora

#22 opened almost 3 years ago by

Evaluating the same model on multiple datasets produces too many metrics, making results difficult to read

#18 opened about 3 years ago by

MoritzLaurer

Evaluating the same model on different splits of the same dataset creates ambiguous evaluations

#17 opened about 3 years ago by

MoritzLaurer

Two successive evaluations on the same model created a conflicting README.md

#16 opened about 3 years ago by

MoritzLaurer

Multiple pull requests for the same dataset and model

#14 opened about 3 years ago by

grapplerulrich

I can't choose a model to evaluate

5

#11 opened about 3 years ago by

BDas

Segments-sidewalk

#10 opened about 3 years ago by

ccdv/pubmed-summarization

#7 opened about 3 years ago by

Blaise-g

Request for Changes in UI

3

#6 opened about 3 years ago by

ghpkishore

Financial Phrasebank

#5 opened about 3 years ago by

Evaluate for speech models?

👍 2

3

#4 opened about 3 years ago by

patrickvonplaten

How does the space know whether a model is fine-tuned or not?

#3 opened about 3 years ago by

patrickvonplaten

Add queue to see which evaluations are running

👍 2

#2 opened about 3 years ago by

lewtun

"already evaluated" Bug?