Hugging Face
AbrutiNoir
Abru
0 followers · 1 following
AI & ML interests
None yet
Recent Activity
reacted to ZennyKenny's post with 🔥, 8 days ago
Besides being the coolest-named benchmark in the game, HellaSwag is an important measure of здравый смысл (common sense) in LLMs.
- More on HellaSwag: https://github.com/rowanz/hellaswag

I spent the afternoon benchmarking YandexGPT Pro 4th Gen, one of the Russian tech giant's premier models.
- Yandex HF Org: https://huggingface.co/yandex
- More on Yandex models: https://yandex.cloud/ru/docs/foundation-models/concepts/yandexgpt/models

The eval notebook is available on GitHub, and the resulting dataset is already on the HF Hub!
- Eval Notebook: https://github.com/kghamilton89/ai-explorer/blob/main/yandex-hellaswag/hellaswag-assess.ipynb
- Eval Dataset: https://huggingface.co/datasets/ZennyKenny/yandexgptpro_4th_gen-hellaswag

And of course, everyone wants to see the results, so have a look at them in the context of the other zero-shot experiments that I was able to find!
liked a Space, 10 months ago: archit11/gemma-10m
upvoted a paper, 10 months ago: Mistral 7B
Organizations
None yet
spaces
1
Mistralai Mixtral 8x7B V0.1 🌍 (Sleeping)
models
None public yet
datasets
None public yet