9-Volt Fan
9voltfan2009
6 followers · 7 following
AI & ML interests
None yet
Recent Activity
Liked a Space about 3 hours ago: nari-labs/Dia-1.6B
Reacted to shekkizh's post with ❤️ about 6 hours ago:
Think AGI is just around the corner? Not so fast. When OpenAI released its Computer-Using Agent (CUA) API, I happened to be playing Wordle 🧩 and thought, why not see how the model handles it? Spoiler: Wordle turned out to be a surprisingly effective benchmark. So Romain Cosentino Ph.D. and I dug in and analyzed the results of several hundred runs. Takeaways: 1️⃣ Even the best computer-using models struggle with simple, context-dependent tasks. 2️⃣ Visual perception and reasoning remain major hurdles for multimodal agents. 3️⃣ Real-world use cases reveal significant gaps between hype and reality. Perception accuracy drops to near zero by the last turn. Read our arXiv article for more details: https://www.arxiv.org/abs/2504.15434
Reacted to the same post by shekkizh with a second emoji about 6 hours ago
Organizations
Models (2)
9voltfan2009/DorkDiaries-RVC • Updated 16 days ago
9voltfan2009/WarioWare-RVC • Updated Nov 20, 2023 • 3
Datasets
None public yet