Hugging Face
Joshua Vendrow
jvendrow
1 follower · 8 following
AI & ML interests
None yet
Recent Activity
upvoted a paper about 20 hours ago: Do Large Language Model Benchmarks Test Reliability?
liked a dataset 7 days ago: madrylab/gsm8k-platinum
new activity 9 days ago: madrylab/platinum-bench: Grammatical error in squad task 5ad2b72fd7d075001a42a022
Organizations
jvendrow's activity
upvoted a paper about 20 hours ago
Do Large Language Model Benchmarks Test Reliability?
Paper • 2502.03461 • Published Feb 5 • 3