weiguosun

chuangxinlezhi

AI & ML interests

None yet

Recent Activity

Organizations

Newname, hapozzshylb, touxiong

chuangxinlezhi's activity

New activity in chuangxinlezhi/1212121 4 days ago

Update README.md

1
#6 opened 4 days ago by
chuangxinlezhi

Update README.md

#5 opened 4 days ago by
chuangxinlezhi

Update README.md

1
#4 opened 4 days ago by
chuangxinlezhi

newstar

#3 opened 4 days ago by
chuangxinlezhi
New activity in shuttleai/shuttle-3-diffusion about 1 month ago

Update README.md

#12 opened about 1 month ago by
chuangxinlezhi

Update README.md

#11 opened about 1 month ago by
chuangxinlezhi
New activity in VTSNLP/instruct_general_dataset about 1 month ago

Update README.md

#3 opened about 1 month ago by
chuangxinlezhi
reacted to m-ric's post with 🚀 about 2 months ago
Post
1635
๐—”๐—ป๐—ฑ๐—ฟ๐—ผ๐—ถ๐—ฑ๐—Ÿ๐—ฎ๐—ฏ: ๐—™๐—ถ๐—ฟ๐˜€๐˜ ๐—ฒ๐˜ƒ๐—ฒ๐—ฟ ๐˜€๐˜†๐˜€๐˜๐—ฒ๐—บ๐—ฎ๐˜๐—ถ๐—ฐ ๐—ฏ๐—ฒ๐—ป๐—ฐ๐—ต๐—บ๐—ฎ๐—ฟ๐—ธ ๐—ณ๐—ผ๐—ฟ ๐—”๐—ป๐—ฑ๐—ฟ๐—ผ๐—ถ๐—ฑ ๐—บ๐—ผ๐—ฏ๐—ถ๐—น๐—ฒ ๐—ฎ๐—ด๐—ฒ๐—ป๐˜๐˜€ ๐˜€๐—ต๐—ผ๐˜„๐˜€ ๐˜๐—ต๐—ฎ๐˜ ๐˜€๐—บ๐—ฎ๐—น๐—น, ๐—ณ๐—ถ๐—ป๐—ฒ-๐˜๐˜‚๐—ป๐—ฒ๐—ฑ ๐—ผ๐—ฝ๐—ฒ๐—ป ๐—บ๐—ผ๐—ฑ๐—ฒ๐—น๐˜€ ๐—ฐ๐—ฎ๐—ป ๐—ฝ๐—ผ๐˜„๐—ฒ๐—ฟ ๐—ฎ ๐—๐—”๐—ฅ๐—ฉ๐—œ๐—ฆ ๐˜€๐˜†๐˜€๐˜๐—ฒ๐—บ ๐—ผ๐—ป ๐˜†๐—ผ๐˜‚๐—ฟ ๐˜€๐—บ๐—ฎ๐—ฟ๐˜๐—ฝ๐—ต๐—ผ๐—ป๐—ฒ ๐Ÿ“ฑ๐Ÿ”ฅ

A team from Tsinghua University just released AndroidLab, the first systematic framework to evaluate and train Android mobile agents that works with both text-only and multimodal models.

They show that fine-tuning small open-source models can significantly boost performance, matching that of much bigger closed models like GPT-4o.

The team built:

📊 A reproducible benchmark with 138 tasks across 9 apps to evaluate mobile agents systematically

📝📱 A framework supporting both text-only (via XML) and visual (via marked screenshots) interfaces

✅ An instruction dataset of 10.5k operation traces for training mobile agents

Key insights:

- 📈 Fine-tuning improves performance BY A LOT: Open-source model Llama-3.1-8B improves from 2% to 24% success rate after training, nearly reaching GPT-4o performance although it's much smaller
- ⚙️ Text-only agents match multimodal ones: XML-based agents achieve similar performance to screenshot-based multimodal agents.

Read their paper here 👉 AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents (2410.24024)
upvoted an article 4 months ago
Article

Local AI with Docker's Testcontainers

By Tonic •
• 7
reacted to fdaudens's post with 🔥❤️🤝 5 months ago
Post
1951
I just had a masterclass in open-source collaboration with the release of Llama 3.1 🦙🤗

Meta dropped Llama 3.1, and seeing firsthand the Hugging Face team working to integrate it is nothing short of impressive. Their swift integration, comprehensive documentation, and innovative tools showcase the power of open-source teamwork.

For the curious minds:

📊 Check out independent evaluations: open-llm-leaderboard/open_llm_leaderboard

🧠 Deep dive into the tech: https://huggingface.co/blog/llama31

👨‍🍳 Try different recipes (including running 8B on free Colab!): https://github.com/huggingface/huggingface-llama-recipes

📈 Visualize open vs. closed LLM progress: andrewrreed/closed-vs-open-arena-elo

🤖 Generate synthetic data with distilabel, thanks to the new license allowing the use of outputs to train other LLMs https://huggingface.co/blog/llama31#synthetic-data-generation-with-distilabel

💡 Pro tip: Experience the 405B version for free on HuggingChat, now with tool-calling capabilities! https://huggingface.co/chat/

#OpenSourceAI #AIInnovation
  • 1 reply
updated a Space 5 months ago